Binary.com is the award-winning leader in online binary options trading. Our platform allows customers to trade currencies, stocks, and commodities.
Founded in 1999, we are one of the oldest and most respected names in our industry. Today, we have over 1 million registered accounts around the world, and trade over US$1 billion in options transactions per year.
Our business is growing strong. Over the last five years, we have grown more than 20-fold. We expect to grow even further in the coming years.
Binary.com is seeking a driven, analytical, and technically gifted Quantitative Analyst to spur our future growth. Your role in the Quantitative Analytics group is essential. You will develop and optimise derivatives pricing and risk management algorithms for our online options trading platform. Your work will directly influence the profitability and success of our company.
Our culture
Binary.com is one of the IT world’s most vibrant and progressive workplaces. We blend the entrepreneurial spirit of a startup with the profitability and stability of a long-running business.
Our company is built on a culture that fosters teamwork, individuality, and creativity.
We care deeply about cultural and gender diversity. We go to great lengths to foster a supportive, caring environment within a flat organisational structure.
We value staff with a sense of fun and adventure, who are optimistic, and customer focused. Above all, you must agree with our strong emphasis on business integrity.
Your skills and motivations
You should be passionate about what Binary.com stands for, and what we do.
You excel at the intersection of mathematics, programming, and finance. You are fascinated by the convergence of all three fields, and how it can push you to scale new heights.
You thrive in a fast-paced environment that values collaboration and open communication. You enjoy the intellectual stimulation provided by mathematical models, data analysis, and financial research.
As our Quantitative Analyst, you will have the opportunity to develop derivatives pricing and risk management models and algorithms. Your work will directly influence the profitability of our trading platform and the future success of our company.
We are looking for someone who loves to:
● Apply mathematical models to real-world scenarios.
● Solve complex and abstract mathematical problems to optimise pricing and manage risk.
● Analyse trading patterns to identify new market opportunities.
● Work with highly talented people in an exciting, multinational environment.
● Do great work, and inspire the people around them to do the same.
● Get things done in a no-nonsense manner.
● Work without bureaucracy and hierarchy.
● Learn and improve, day in and day out.
To excel in this role, you must have:
● An advanced university degree in Physics, Financial Engineering or Mathematics.
● Experience in exotic options pricing, volatility forecasts, high-frequency trading, and the analysis of market inefficiencies.
● Knowledge of probability theory, stochastic calculus, numerical methods, Monte-Carlo simulation, differential equations, econometrics, and statistical modelling.
● Expertise in the application of object-oriented programming languages (C++, Perl, and Java), coupled with the ability to produce high-quality code.
● Experience in using financial information sources such as Bloomberg and Reuters.
● Relevant experience in the use of quant programming libraries and frameworks (QuantLib, Pricing Partners, FINCAD, and Numerix), and quant pricing platforms (SuperDerivatives and FENICS) would be a plus.
Your role
Binary.com’s Quantitative Analytics team is responsible for the pricing of our binary options. You will join them in managing the risk and profitability of the company’s options book.
The work that you do is complex, challenging, and essential to our future.
We process over a million transactions each day, and manage a book of exotic options which exceeds the complexity of the typical derivatives desk.
Since all transactions on the Binary.com website are fully automated, our pricing and risk management algorithms must fully consider critical factors such as real-time pricing parameters, data feed irregularities, and latencies.
You will:
● Develop derivatives pricing, risk management models, and algorithms using C/C++, R, MATLAB, Perl, Python, and Java.
● Review, develop, and enhance Perl, C++, and R codes used in options pricing, volatility forecasts, and risk management programs.
● Maintain accurate system pricing parameters.
● Perform data mining using SQL databases, R/S-Plus, OLAP, and other analytical tools.
● Monitor website trading activity and minimise abuse.
● Generate periodic and special reports that summarise client trading trends.
Remuneration and benefits
This position includes a market-based salary, annual performance bonus, and health benefits. You will also receive travel and Internet allowances.
You will enjoy a casual dress code and flexi hours. You also have the freedom to select your preferred tools and systems.
We will also assist you with your work permit, and relocation for your family.
Location
This position is based at our operational headquarters in Cyberjaya, Malaysia.
Cyberjaya is an exciting high-tech precinct south of Kuala Lumpur, the capital of Malaysia. The benefits of working in Cyberjaya include a low cost of living, and modern infrastructure. You’ll also have easy access to some of Asia’s most spectacular scenery.
To support further growth, we are setting up an office in central Kuala Lumpur. You’ll have an opportunity to accomplish amazing things in one of the world’s most exciting cities.
Below is the questionnaire. I created this file to apply MCMCpack and forecast to complete the questions, before fitting the Ridge, ElasticNet and LASSO regressions (quite a lot of models for comparison). We can use cv.glmnet() in the glmnet package, or the caret package, for cross-validated models. You can refer to Algorithmic Trading and Successful Algorithmic Trading, which apply cross-validation to forecasting in financial markets; the ebook of Successful Algorithmic Trading, with full Python code, is also available for purchase.
2. Content
2.1 Question 1
2.1.1 Read Data
I use 3 years of data for this question as an experiment: the first year is burn-in data for statistical modelling and prediction purposes, while the following 2 years are used for forecasting and staking. There are 252 trading days in a year.
## get currency dataset online.
## http://stackoverflow.com/questions/24219694/get-symbols-quantmod-ohlc-currency-data
#'@ getFX('USD/JPY', from = '2014-01-01', to = '2017-01-20')
## getFX() doesn't return Op, Hi, Lo, Cl prices, only a single price series, so there is no way to place bets.
#'@ USDJPY <- getSymbols('JPY=X', src = 'yahoo', from = '2014-01-01',
#'@ to = '2017-01-20', auto.assign = FALSE)
#'@ names(USDJPY) <- str_replace_all(names(USDJPY), 'JPY=X', 'USDJPY')
#'@ USDJPY <- xts(USDJPY[, -1], order.by = USDJPY$Date)
#'@ saveRDS(USDJPY, './data/USDJPY.rds')
USDJPY <- read_rds(path = './data/USDJPY.rds')
mbase <- USDJPY
## dateID
dateID <- index(mbase)
dateID0 <- ymd('2015-01-01')
dateID <- dateID[dateID > dateID0]
dim(mbase)
## [1] 797 6
summary(mbase) %>% kable(width = 'auto')
| Index            | USDJPY.Open    | USDJPY.High  | USDJPY.Low     | USDJPY.Close   | USDJPY.Volume | USDJPY.Adjusted |
|------------------|----------------|--------------|----------------|----------------|---------------|-----------------|
| Min. :2014-01-01 | Min. : 99.89   | Min. :100.4  | Min. : 99.57   | Min. : 99.91   | Min. :0       | Min. : 99.91    |
| 1st Qu.:2014-10-07 | 1st Qu.:103.18 | 1st Qu.:103.6 | 1st Qu.:102.79 | 1st Qu.:103.19 | 1st Qu.:0   | 1st Qu.:103.19  |
| Median :2015-07-13 | Median :112.50 | Median :113.0 | Median :112.03 | Median :112.49 | Median :0   | Median :112.49  |
| Mean :2015-07-12 | Mean :111.95   | Mean :112.3  | Mean :111.53   | Mean :111.95   | Mean :0       | Mean :111.95    |
| 3rd Qu.:2016-04-18 | 3rd Qu.:119.76 | 3rd Qu.:120.1 | 3rd Qu.:119.25 | 3rd Qu.:119.78 | 3rd Qu.:0   | 3rd Qu.:119.78  |
| Max. :2017-01-20 | Max. :125.60   | Max. :125.8  | Max. :124.97   | Max. :125.63   | Max. :0       | Max. :125.63    |
2.1.2 Statistical Modelling
2.1.2.1 ARIMA vs ETS
Remarks: here I try to predict the sell/buy price and also the settled price. However, I just noticed that the question asks about predicting the variance based on the mean price (the profit is made from the range of the Hi-Lo variance, not from the accuracy of the highest, lowest or closing price). I could also use the forecasted highest and forecasted lowest price for variance prediction; however, I will conduct another study and answer the variance question with GARCH models.
Below are some articles regarding exponential smoothing.
It is a common myth that ARIMA models are more general than exponential smoothing. While linear exponential smoothing models are all special cases of ARIMA models, the non-linear exponential smoothing models have no equivalent ARIMA counterparts. There are also many ARIMA models that have no exponential smoothing counterparts. In particular, every ETS model is non-stationary, while ARIMA models can be stationary.
Note on forecast::ets(): the model is usually a three-character string identifying the method, using the framework terminology of Hyndman et al. (2002) and Hyndman et al. (2008). The first letter denotes the error type (“A”, “M” or “Z”); the second letter denotes the trend type (“N”, “A”, “M” or “Z”); and the third letter denotes the season type (“N”, “A”, “M” or “Z”). In all cases, “N” = none, “A” = additive, “M” = multiplicative and “Z” = automatically selected. So, for example, “ANN” is simple exponential smoothing with additive errors, and “MAM” is multiplicative Holt-Winters’ method with multiplicative errors. It is also possible for the model to be of class “ets”, equal to the output from a previous call to ets(); in that case, the same model is fitted to y without re-estimating any smoothing parameters. See also the use.initial.values argument.
The ETS models with seasonality or non-damped trend or both have two unit roots (i.e., they need two levels of differencing to make them stationary). All other ETS models have one unit root (they need one level of differencing to make them stationary).
The following table gives some equivalence relationships for the two classes of models.
| ETS model | ARIMA model | Parameters |
|---|---|---|
| \(ETS(A, N, N)\) | \(ARIMA(0, 1, 1)\) | \(θ_{1} = α − 1\) |
| \(ETS(A, A, N)\) | \(ARIMA(0, 2, 2)\) | \(θ_{1} = α + β − 2\), \(θ_{2} = 1 − α\) |
| \(ETS(A, A_{d}, N)\) | \(ARIMA(1, 1, 2)\) | \(ϕ_{1} = ϕ\), \(θ_{1} = α + ϕβ − 1 − ϕ\), \(θ_{2} = (1 − α)ϕ\) |
| \(ETS(A, N, A)\) | \(ARIMA(0, 0, m)(0, 1, 0)_{m}\) | |
| \(ETS(A, A, A)\) | \(ARIMA(0, 1, m+1)(0, 1, 0)_{m}\) | |
| \(ETS(A, A_{d}, A)\) | \(ARIMA(1, 0, m+1)(0, 1, 0)_{m}\) | |
For the seasonal models, there are a large number of restrictions on the ARIMA parameters.
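The non-seasonal rows of the equivalence table are easy to verify by arithmetic. A minimal Python sketch (the function names are my own) mapping ETS smoothing parameters to the implied ARIMA MA coefficients:

```python
def ets_ann_to_arima(alpha):
    """ETS(A,N,N) is ARIMA(0,1,1) with theta1 = alpha - 1."""
    return (alpha - 1.0,)

def ets_aan_to_arima(alpha, beta):
    """ETS(A,A,N) is ARIMA(0,2,2) with theta1 = alpha + beta - 2
    and theta2 = 1 - alpha."""
    return (alpha + beta - 2.0, 1.0 - alpha)

# Simple exponential smoothing with alpha = 0.3 implies an MA(1)
# coefficient of about -0.7 after one level of differencing.
print(ets_ann_to_arima(0.3))
print(ets_aan_to_arima(0.3, 0.1))
```

This also makes the unit-root remark above concrete: the ETS(A,A,N) trend model maps to an ARIMA model with d = 2, i.e. two levels of differencing.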
## Here I test the forecasting accuracy of the ets ZZZ model (model 1).
## Test the models
## opened price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitETS.op))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitETS.op)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.4353 -0.4004 -0.0269 0.3998 3.3978
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.180332 0.490019 0.368 0.713
## USDJPY.Close 0.998722 0.004256 234.666 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7286 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9904, Adjusted R-squared: 0.9904
## F-statistic: 5.507e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitETS.op))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) 0.1808 0.489606 4.896e-03 4.896e-03
## USDJPY.Close 0.9987 0.004257 4.257e-05 4.257e-05
## sigma2 0.5330 0.033014 3.301e-04 3.301e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -0.7795 -0.1487 0.1848 0.5094 1.1441
## USDJPY.Close 0.9904 0.9959 0.9987 1.0016 1.0070
## sigma2 0.4716 0.5100 0.5317 0.5549 0.6009
## highest price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitETS.hi))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitETS.hi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.3422 -0.3298 -0.0987 0.2166 3.2868
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.140616 0.379253 3.008 0.00276 **
## USDJPY.Close 0.993982 0.003294 301.765 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5639 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9942, Adjusted R-squared: 0.9942
## F-statistic: 9.106e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitETS.hi))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) 1.1410 0.378933 3.789e-03 3.789e-03
## USDJPY.Close 0.9940 0.003295 3.295e-05 3.295e-05
## sigma2 0.3193 0.019776 1.978e-04 1.978e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) 0.3978 0.8860 1.1441 1.3953 1.8865
## USDJPY.Close 0.9875 0.9918 0.9939 0.9962 1.0004
## sigma2 0.2825 0.3055 0.3185 0.3324 0.3599
## mean price fit data (mean price of daily highest and lowest price)
summary(lm(Point.Forecast~ USDJPY.Close, data = fitETS.mn))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitETS.mn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.55047 -0.26416 -0.00996 0.26743 1.81654
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.106616 0.326718 0.326 0.744
## USDJPY.Close 0.999098 0.002838 352.091 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4858 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9957, Adjusted R-squared: 0.9957
## F-statistic: 1.24e+05 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitETS.mn))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) 0.1069 0.326443 3.264e-03 3.264e-03
## USDJPY.Close 0.9991 0.002838 2.838e-05 2.838e-05
## sigma2 0.2369 0.014676 1.468e-04 1.468e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -0.5333 -0.1127 0.1096 0.3260 0.7492
## USDJPY.Close 0.9935 0.9972 0.9991 1.0010 1.0046
## sigma2 0.2096 0.2267 0.2364 0.2467 0.2671
## lowest price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitETS.lo))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitETS.lo)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1318 -0.2450 0.0860 0.3331 1.4818
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.3885 0.3684 -3.769 0.000182 ***
## USDJPY.Close 1.0083 0.0032 315.094 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5478 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9947, Adjusted R-squared: 0.9947
## F-statistic: 9.928e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitETS.lo))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) -1.3881 0.368114 3.681e-03 3.681e-03
## USDJPY.Close 1.0082 0.003201 3.201e-05 3.201e-05
## sigma2 0.3013 0.018663 1.866e-04 1.866e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -2.1101 -1.6358 -1.3851 -1.1410 -0.6638
## USDJPY.Close 1.0020 1.0061 1.0082 1.0104 1.0145
## sigma2 0.2666 0.2883 0.3006 0.3137 0.3397
## closed price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitETS.cl))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitETS.cl)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.4339 -0.4026 -0.0249 0.3998 3.4032
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.17826 0.49050 0.363 0.716
## USDJPY.Close 0.99873 0.00426 234.437 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7293 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9904, Adjusted R-squared: 0.9904
## F-statistic: 5.496e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitETS.cl))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) 0.1787 0.490086 4.901e-03 4.901e-03
## USDJPY.Close 0.9987 0.004261 4.261e-05 4.261e-05
## sigma2 0.5340 0.033079 3.308e-04 3.308e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -0.7825 -0.1511 0.1827 0.5077 1.1430
## USDJPY.Close 0.9904 0.9959 0.9987 1.0016 1.0071
## sigma2 0.4725 0.5110 0.5327 0.5559 0.6021
Basically, for volatility analysis we can use the RSY volatility measure; kindly refer to Analyzing Financial Data and Implementing Financial Models Using R (paper 22) for more information. The GARCH model, on the other hand, is designed for forecasting volatility.
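As a sketch of the range-based idea behind such measures, the Rogers–Satchell daily variance term can be computed directly from OHLC bars. The prices below are toy values, not the USDJPY series:

```python
import math

def rogers_satchell(o, h, l, c):
    """Daily Rogers-Satchell variance term from open/high/low/close.
    Both products are non-negative, since h >= o, c and l <= o, c."""
    return (math.log(h / c) * math.log(h / o)
            + math.log(l / c) * math.log(l / o))

def rs_vol(bars, periods=252):
    """Annualised Rogers-Satchell volatility over a list of (o, h, l, c) bars."""
    var = sum(rogers_satchell(*b) for b in bars) / len(bars)
    return math.sqrt(periods * var)

bars = [(112.0, 112.8, 111.6, 112.4),   # toy OHLC rows
        (112.4, 113.1, 112.0, 112.2)]
print(round(rs_vol(bars), 4))
```

Unlike a close-to-close estimator, this uses the full Hi-Lo range of each day, which is exactly the quantity the variance bet depends on.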
Now we look at the GARCH model. Figlewski (2004) (paper 19) applied several models and also used different lengths of data for comparison. Here I use daily Hi-Lo and 365 days of data in order to predict the next market price. The author applied GARCH to the S&P 500, 10-year bonds and 20-year bonds, and concluded that the GARCH model is better than EGARCH, that an implied volatility model is better than both GARCH and EGARCH, and that monthly Hi-Lo data is more accurate than daily Hi-Lo for long-term investment.
Basic Introduction to GARCH and EGARCH (Part 3): here is the final part of the series of posts on volatility modelling, where I briefly talk about one of the many variants of the GARCH model: the exponential GARCH (EGARCH). I chose this variant because it improves on the GARCH model and better captures some market mechanics; using the EGARCH model, we can expect a better estimate of the volatility of asset returns, because EGARCH counteracts the limitations of the classic GARCH model. In the GARCH post I didn’t mention any limitations of the model, as I kept them for today’s post. First of all, the GARCH model assumes that only the magnitude of unanticipated excess returns determines \(\sigma^2_t\). Intuitively, we can question this assumption; I, for one, would argue that not only the magnitude but also the direction of the returns affects volatility.
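That sign effect is exactly what EGARCH adds. A minimal sketch of the EGARCH(1,1) log-variance recursion, with illustrative (not fitted) parameter values, showing that with a negative gamma a bad-news shock raises volatility more than a good-news shock of the same magnitude:

```python
import math

def egarch_next_var(sigma2, z, omega=-0.1, beta=0.95, alpha=0.1, gamma=-0.08):
    """One EGARCH(1,1) step:
    ln sigma^2_t = omega + beta*ln sigma^2_{t-1} + alpha*(|z| - E|z|) + gamma*z,
    where z is the standardised shock. Parameter values are illustrative."""
    e_abs_z = math.sqrt(2.0 / math.pi)          # E|z| for a standard normal z
    log_s2 = (omega + beta * math.log(sigma2)
              + alpha * (abs(z) - e_abs_z) + gamma * z)
    return math.exp(log_s2)                      # always positive by construction

s2 = 0.5
up = egarch_next_var(s2, +1.5)     # positive return shock
down = egarch_next_var(s2, -1.5)   # negative shock of equal magnitude
print(up < down)                   # with gamma < 0, bad news raises volatility more
```

Because the recursion models the log of the variance, no positivity constraints on the coefficients are needed, which is another practical advantage over the classic GARCH form.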
Firstly we use rugarch, and then rmgarch, to compare the results. (Because the saved files are heavy to load, I leave the multivariate GARCH models for future work.)
##
## *---------------------------------*
## * GARCH Model Spec *
## *---------------------------------*
##
## Conditional Variance Dynamics
## ------------------------------------
## GARCH Model : sGARCH(1,1)
## Variance Targeting : FALSE
##
## Conditional Mean Dynamics
## ------------------------------------
## Mean Model : ARFIMA(1,0,1)
## Include Mean : TRUE
## GARCH-in-Mean : FALSE
##
## Conditional Distribution
## ------------------------------------
## Distribution : norm
## Includes Skew : FALSE
## Includes Shape : FALSE
## Includes Lambda : FALSE
This defines a basic ARMA(1,1)-GARCH(1,1) model, though there are many more options to choose from, ranging from the type of GARCH model to the ARFIMAX/ARCH-in-mean specification and the conditional distribution. In fact, considering only the (1,1) order for the GARCH and ARMA models, there are 13440 possible combinations of models and model options to choose from:
## possible Garch models.
nrow(expand.grid(GARCH = 1:14, VEX = 0:1, VT = 0:1, Mean = 0:1, ARCHM = 0:2, ARFIMA = 0:1, MEX = 0:1, DISTR = 1:10))
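The same count can be reproduced outside R; here is a quick Python sketch of the cartesian product behind the expand.grid() call above:

```python
from itertools import product

# Option counts mirroring the expand.grid() call: 14 GARCH variants,
# external variance regressors on/off, variance targeting on/off,
# mean term on/off, three ARCH-in-mean settings, ARFIMA on/off,
# external mean regressors on/off, and 10 conditional distributions.
choices = [range(1, 15), range(2), range(2), range(2),
           range(3), range(2), range(2), range(1, 11)]

n_models = sum(1 for _ in product(*choices))
print(n_models)  # 13440
```

The product 14 × 2 × 2 × 2 × 3 × 2 × 2 × 10 gives the 13440 combinations quoted above.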
## Here I test the forecasting accuracy of the univariate GARCH ('sGARCH') models.
## Test the models
## opened price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitGM.op))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitGM.op)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6209 -0.4287 -0.0281 0.4425 3.6843
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.257395 0.504495 -0.51 0.61
## USDJPY.Close 1.002349 0.004382 228.76 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7502 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9899, Adjusted R-squared: 0.9899
## F-statistic: 5.233e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitGM.op))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) -0.2569 0.504069 5.041e-03 5.041e-03
## USDJPY.Close 1.0023 0.004383 4.383e-05 4.383e-05
## sigma2 0.5649 0.034994 3.499e-04 3.499e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -1.2456 -0.5961 -0.2528 0.08144 0.7349
## USDJPY.Close 0.9938 0.9994 1.0023 1.00533 1.0109
## sigma2 0.4999 0.5406 0.5636 0.58812 0.6369
## highest price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitGM.hi))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitGM.hi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.4099 -0.3440 -0.1103 0.2854 3.6113
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.456394 0.399124 1.143 0.253
## USDJPY.Close 0.999852 0.003466 288.435 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5935 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9936, Adjusted R-squared: 0.9936
## F-statistic: 8.319e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitGM.hi))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) 0.4568 0.398787 3.988e-03 3.988e-03
## USDJPY.Close 0.9998 0.003467 3.467e-05 3.467e-05
## sigma2 0.3536 0.021902 2.190e-04 2.190e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -0.3254 0.1884 0.4600 0.7245 1.2414
## USDJPY.Close 0.9931 0.9975 0.9998 1.0022 1.0066
## sigma2 0.3129 0.3384 0.3527 0.3681 0.3986
## mean price fit data (mean price of daily highest and lowest price)
summary(lm(Point.Forecast~ USDJPY.Close, data = fitGM.mn))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitGM.mn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.79126 -0.26821 -0.01816 0.25463 1.73627
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.610431 0.333376 -1.831 0.0677 .
## USDJPY.Close 1.005115 0.002895 347.137 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4957 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9956, Adjusted R-squared: 0.9956
## F-statistic: 1.205e+05 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitGM.mn))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) -0.6101 0.333096 3.331e-03 3.331e-03
## USDJPY.Close 1.0051 0.002896 2.896e-05 2.896e-05
## sigma2 0.2467 0.015281 1.528e-04 1.528e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -1.2634 -0.8343 -0.6074 -0.3865 0.04526
## USDJPY.Close 0.9994 1.0032 1.0051 1.0071 1.01078
## sigma2 0.2183 0.2361 0.2461 0.2568 0.27813
## lowest price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitGM.lo))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitGM.lo)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0822 -0.2652 0.1064 0.3345 1.6406
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.345448 0.389955 -6.015 3.35e-09 ***
## USDJPY.Close 1.016070 0.003387 300.005 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5798 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9941, Adjusted R-squared: 0.9941
## F-statistic: 9e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitGM.lo))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) -2.3451 0.389626 3.896e-03 3.896e-03
## USDJPY.Close 1.0161 0.003388 3.388e-05 3.388e-05
## sigma2 0.3375 0.020908 2.091e-04 2.091e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -3.1093 -2.607 -2.3419 -2.0835 -1.5785
## USDJPY.Close 1.0094 1.014 1.0160 1.0184 1.0227
## sigma2 0.2986 0.323 0.3367 0.3514 0.3805
## closed price fit data
summary(lm(Point.Forecast~ USDJPY.Close, data = fitGM.cl))
##
## Call:
## lm(formula = Point.Forecast ~ USDJPY.Close, data = fitGM.cl)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.4394 -0.3998 -0.0405 0.4134 3.7289
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.222628 0.500474 -0.445 0.657
## USDJPY.Close 1.002070 0.004347 230.534 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7442 on 533 degrees of freedom
## (216 observations deleted due to missingness)
## Multiple R-squared: 0.9901, Adjusted R-squared: 0.9901
## F-statistic: 5.315e+04 on 1 and 533 DF, p-value: < 2.2e-16
summary(MCMCregress(Point.Forecast~ USDJPY.Close, data = fitGM.cl))
##
## Iterations = 1001:11000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## (Intercept) -0.2222 0.500052 5.001e-03 5.001e-03
## USDJPY.Close 1.0021 0.004348 4.348e-05 4.348e-05
## sigma2 0.5560 0.034438 3.444e-04 3.444e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## (Intercept) -1.2029 -0.5587 -0.2181 0.1135 0.7617
## USDJPY.Close 0.9935 0.9992 1.0020 1.0050 1.0106
## sigma2 0.4919 0.5320 0.5546 0.5788 0.6268
Staking function. Here I apply the Kelly criterion as the betting strategy. I don’t pretend to know the order of the price fluctuation within the Hi-Lo range, so I simply use the closing price for settlement, while the staking price is restricted to lie within the variance (Hi-Lo) range for the transaction to stand. The settled price can only be the closing price, unless the staking price is an opening price sellable within the Hi-Lo range.
Since we cannot tell, from Hi-Lo data alone, whether the forecasted sell/buy price or the forecasted closing price comes first, the Profit & Loss will differ slightly (sell/buy price = forecasted sell/buy price).
Forecasted profit = edge based on forecasted sell/buy price − forecasted settled price.
If the forecasted sell/buy price does not fall within the Hi-Lo range, the transaction does not stand.
If the forecasted settled price does not fall within the Hi-Lo range, the settled price is the real closing price.
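The two settlement rules above can be sketched as a single function; the argument names here are my own, hypothetical ones:

```python
def settle(buy, sell, hi, lo, close):
    """Apply the settlement rules: the trade stands only if the forecast
    buy price falls inside the day's Hi-Lo range; if the forecast sell
    price never trades, the real closing price settles the bet."""
    if not (lo <= buy <= hi):
        return None                        # transaction does not stand
    exit_price = sell if lo <= sell <= hi else close
    return exit_price - buy                # profit per unit

# Buy fills inside the range; sell misses it, so the close settles.
print(settle(buy=112.0, sell=113.5, hi=113.0, lo=111.5, close=112.6))
```

Returning None for a missed buy mirrors the "no transaction made" case, so a day-by-day backtest can simply skip those entries.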
Kindly refer to Quintuitive’s ARMA Models for Trading to learn how to determine PUT or CALL with ARMA models (the author compares the ROI of Buy-and-Hold against a GARCH model).
Here I also apply leverage, although it is very risky (the variance of the ROI is very high), as we will see from the later comparison.
Staking Model
For the Buy-Low-Sell-High tactic, I place two limit orders for tomorrow: a buy and a sell. The transaction stands once the price is hit tomorrow. If the buy price is not met, no transaction is made; if the sell price does not occur, the closing price is used for settlement. (This uses the Kelly criterion staking model.)
For variance betting, I use the forecasted highest price minus the forecasted lowest price to get the range, and then place two limit orders as well. If either the buy or the sell price does not appear, the closing price is used as the final settlement. (Here I place $100 on every single bet.)
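The Kelly stake used in the first staking model can be sketched as follows; the win probability and net odds are hypothetical inputs that would be estimated from the forecasts, not fitted values from this study:

```python
def kelly_fraction(p, b):
    """Kelly criterion: optimal fraction of the bankroll to stake on a
    bet won with probability p at net odds b (profit per unit staked)."""
    q = 1.0 - p
    return max((b * p - q) / b, 0.0)   # never stake on a negative edge

bankroll = 1000.0
f = kelly_fraction(p=0.55, b=1.0)      # a 55% win rate at even odds
print(round(bankroll * f, 2))          # stake about 10% of the bankroll
```

Clamping negative edges to zero keeps the strategy out of bets where the forecast gives no advantage, which is why the Kelly fund can sit flat for long stretches.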
2.1.4.2 Garch vs EWMA
The staking models are the same as those I applied to the ETS-modelled dataset.
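For reference, the EWMA side of this comparison follows the RiskMetrics recursion; λ = 0.94 is the conventional daily value, an assumption here rather than a fitted parameter:

```python
import math

def ewma_vol(returns, lam=0.94):
    """RiskMetrics EWMA volatility:
    sigma^2_t = lam * sigma^2_{t-1} + (1 - lam) * r^2_{t-1}."""
    s2 = returns[0] ** 2                  # seed with the first squared return
    for r in returns[1:]:
        s2 = lam * s2 + (1.0 - lam) * r ** 2
    return math.sqrt(s2)

rets = [0.001, -0.004, 0.002, 0.006, -0.003]   # toy daily log returns
print(round(ewma_vol(rets), 6))
```

EWMA is the special case of GARCH(1,1) with ω = 0 and α + β = 1, which is what makes the two models natural candidates for a head-to-head comparison.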
2.1.4.3 MCMC vs Bayesian Time Series
2.1.4.4 MIDAS
2.1.5 Return of Investment
2.1.5.1 ARIMA vs ETS
| .id | StartDate | LatestDate | InitFund | LatestFund | Profit | RR |
|---|---|---|---|---|---|---|
| fundAutoArimaCLCL | 2015-01-02 | 2017-01-20 | 1000 | 1000.000 | 0.00000 | 1.000000 |
| fundAutoArimaCLHI | 2015-01-02 | 2017-01-20 | 1000 | 1323.688 | 323.68809 | 1.323688 |
| fundAutoArimaCLLO | 2015-01-02 | 2017-01-20 | 1000 | 1261.157 | 261.15684 | 1.261157 |
| fundAutoArimaCLMN | 2015-01-02 | 2017-01-20 | 1000 | 1292.947 | 292.94694 | 1.292947 |
| fundAutoArimaHICL | 2015-01-02 | 2017-01-20 | 1000 | 1401.694 | 401.69378 | 1.401694 |
| fundAutoArimaHIHI | 2015-01-02 | 2017-01-20 | 1000 | 1000.000 | 0.00000 | 1.000000 |
| fundAutoArimaHILO | 2015-01-02 | 2017-01-20 | 1000 | 1637.251 | 637.25113 | 1.637251 |
| fundAutoArimaHIMN | 2015-01-02 | 2017-01-20 | 1000 | 1363.714 | 363.71443 | 1.363714 |
| fundAutoArimaLOCL | 2015-01-02 | 2017-01-20 | 1000 | 1499.818 | 499.81773 | 1.499818 |
| fundAutoArimaLOHI | 2015-01-02 | 2017-01-20 | 1000 | 1716.985 | 716.98492 | 1.716985 |
| fundAutoArimaLOLO | 2015-01-02 | 2017-01-20 | 1000 | 1000.000 | 0.00000 | 1.000000 |
| fundAutoArimaLOMN | 2015-01-02 | 2017-01-20 | 1000 | 1440.170 | 440.16965 | 1.440170 |
| fundAutoArimaMNCL | 2015-01-02 | 2017-01-20 | 1000 | 1158.790 | 158.79028 | 1.158790 |
| fundAutoArimaMNHI | 2015-01-02 | 2017-01-20 | 1000 | 1236.199 | 236.19900 | 1.236199 |
| fundAutoArimaMNLO | 2015-01-02 | 2017-01-20 | 1000 | 1250.375 | 250.37547 | 1.250376 |
| fundAutoArimaMNMN | 2015-01-02 | 2017-01-20 | 1000 | 1000.000 | 0.00000 | 1.000000 |
| fundAutoArimaOPCL | 2015-01-02 | 2017-01-20 | 1000 | 1047.563 | 47.56281 | 1.047563 |
| fundAutoArimaOPHI | 2015-01-02 | 2017-01-20 | 1000 | 1325.983 | 325.98313 | 1.325983 |
| fundAutoArimaOPLO | 2015-01-02 | 2017-01-20 | 1000 | 1307.610 | 307.60951 | 1.307610 |
| fundAutoArimaOPMN | 2015-01-02 | 2017-01-20 | 1000 | 1304.819 | 304.81916 | 1.304819 |
The return on investment from the best-fitted auto.arima model.
From the table summary above we can see that model 1, without any leverage, grows at a stable pace, with LoHi and HiLo generating the highest return rates. fundLOHI denotes an investment fund that buys at the LOwest price and sells at the HIghest price, and vice versa.
In order to trace the errors, I checked the source code of the function and also tested the code, as you can see via Error: Forbidden model combination #554. Here I only take 22 models out of the 48 models.
## load the pre-run and saved models.
## Profit and Loss of multi-ets models. 22 models.
## The 'MNM' model matches only 1 file in dir() instead of 25, so I force-omit it here...
#' @> sapply(ets.m, function(x) {
#' @ dir('data', pattern = x) %>% length
#' @ }, USE.NAMES = TRUE) %>% .[. > 0]
#ANN MNN ZNN AAN MAN ZAN MMN ZMN AZN MZN ZZN MNM ANZ MNZ ZNZ AAZ MAZ ZAZ MMZ ZMZ AZZ MZZ ZZZ
# 25 25 25 25 25 25 25 25 25 25 25 1 25 25 25 25 25 25 25 25 25 25 25
nms <- sapply(ets.m, function(x) {
dir('data', pattern = x) %>% length
}, USE.NAMES = TRUE) %>% .[. == 25] %>% names #here I use only [. == 25].
#'@ nms <- sapply(ets.m, function(x) {
#'@ dir('data', pattern = x) %>% length
#'@ }, USE.NAMES = TRUE) %>% .[. > 0] %>% names #here original [. > 0].
fls <- sapply(nms, function(x) {
sapply(pp, function(y) {
dir('data', pattern = paste0(x, '.', y[1], y[2]))
})
})
## 22 ets models x 25 different price combinations (hilo, opcl, mnmn, opop, etc.): 550 models in total.
fundList <- llply(fls, function(dt) {
cbind(Model = str_replace_all(dt, '.rds', ''),
readRDS(file = paste0('./data/', dt))) %>% tbl_df
})
names(fundList) <- sapply(fundList, function(x) xts::first(x$Model))
## Summary of ROI
ets.tbl <- ldply(fundList, function(x) {
  x %>%
    mutate(StartDate = xts::first(Date), LatestDate = last(Date),
           InitFund = xts::first(BR), LatestFund = last(Bal),
           Profit = sum(Profit), RR = LatestFund / InitFund) %>%
    dplyr::select(StartDate, LatestDate, InitFund, LatestFund, Profit, RR) %>%
    unique
}) %>% tbl_df
From the table above, we find that the ets models AZN and AZZ generate the highest returns compared with the remaining ets models.
Figlewski (2004) applied a few models and also used different lengths of data for comparison. Here I use daily Hi-Lo and 365 days of data to predict the next market price. Since I only predict 2 years of investment, further research is needed on data sizing and longer prediction terms (for example: 1 month, 3 months or 6 months of data to predict the coming price, and secondly a comparison of the ROI over 7 years or more).
Variance/Volatility Analysis
Hereby I try to place bets on the variance, as requested by the assessment. Firstly we look at the Auto ARIMA model.
## load the pre-run and saved models.
## Profit and Loss of Arima models.
fundList <- llply(flsAutoArima, function(dt) {
cbind(Model = str_replace_all(dt, '.rds', ''),
readRDS(file = paste0('./data/', dt))) %>% tbl_df
})
names(fundList) <- sapply(fundList, function(x) xts::first(x$Model))
From the coding above and the graph below, we can see that my first staking method12 (the variance range is based solely on the forecasted figures, irrespective of real-time volatility, and settlement is made only after the market has closed; the daily Hi-Lo variance is then compared with the initially forecasted variance, and a day on which the forecasted highest or lowest price never occurs does not affect the predicted transaction), which does NOT EXCEED the daily Hi-Lo range, will generate a profit or be ruined depending on the statistical model.
The second staking method is based on real-time volatility: the transaction only stands if the highest or lowest price happened within the variance, as in the initial Kelly staking model. The closing price is taken as the highest or lowest price if one of those prices does not fall within the variance range.
It doesn't work, since the closing price MUST lie between the highest and lowest prices. Here I stop and set eval = FALSE, so the code is displayed but not executed.
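The two settlement rules above can be sketched as follows (hypothetical helper names; the profit rule is simplified to a binary in/out of the forecast band):

```r
## Method 1: settle only against the closing price after the market closes;
## the intraday path is ignored.
settle_method1 <- function(fc_lo, fc_hi, close) {
  close >= fc_lo & close <= fc_hi
}

## Method 2: the transaction only stands if the day's actual High or Low
## fell inside the forecasted variance band; otherwise fall back to the
## close, which always lies between High and Low, so the fallback always
## succeeds -- mirroring the remark above that this rule doesn't work.
settle_method2 <- function(fc_lo, fc_hi, hi, lo, close) {
  if (hi <= fc_hi || lo >= fc_lo) TRUE
  else close >= fc_lo & close <= fc_hi
}

settle_method1(fc_lo = 98, fc_hi = 103, close = 101)   # TRUE
```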
2.1.6.2 GARCH vs EWMA
As I mentioned in the first section, there would be more than 10,000 combination models, so I refer to acf() and pacf() to determine the best-fit values of p and q for the ARMA model. You can refer to the articles below for more information.
时间序列分析之ARIMA模型预测__R篇13 (Time Series Analysis: ARIMA Model Forecasting, R Edition). This article compares the combination models with acf and pacf and finds that acf and pacf produce a better-fitting model.
8.7 ARIMA modelling in R14: the best model (with the smallest AICc) is selected from the following four: ARIMA(2,d,2), ARIMA(0,d,0), ARIMA(1,d,0), ARIMA(0,d,1).
Here I use a function to find the optimal values of p and q from armaOrder(0,0) to armaOrder(5,5), referring to R-ARMA(p,q)如何选找最小AIC的p,q值 (R: how to find the p,q of ARMA(p,q) with the smallest AIC).
However, since finding the optimal r and s for the GARCH model by testing garchOrder(r,s) would consume a lot of time, I skip it and simply use the default garchOrder(1,1).
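The grid search just described can be sketched as below (a minimal sketch on simulated data, not the exact function used in this report):

```r
set.seed(123)
## Simulated ARMA series standing in for the price data.
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 300)

## Search armaOrder(0,0) .. armaOrder(5,5) for the smallest AIC.
best <- list(aic = Inf, order = c(0, 0))
for (p in 0:5) {
  for (q in 0:5) {
    fit <- tryCatch(arima(x, order = c(p, 0, q)), error = function(e) NULL)
    if (!is.null(fit) && AIC(fit) < best$aic) {
      best <- list(aic = AIC(fit), order = c(p, q))
    }
  }
}
best$order   # the (p, q) pair with the smallest AIC
```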
An ARMA(p,q) model specifies the conditional mean of the process
The GARCH(r,s) model specifies the conditional variance of the process
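In the standard textbook notation (an assumption on notation, with \(\varepsilon_t = \sigma_t z_t\)), the two specifications above are:

```latex
% ARMA(p,q): conditional mean
r_t = \mu + \sum_{i=1}^{p} \phi_i \, r_{t-i}
          + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t
% GARCH(r,s): conditional variance
\sigma_t^2 = \omega + \sum_{i=1}^{r} \alpha_i \, \varepsilon_{t-i}^2
                    + \sum_{j=1}^{s} \beta_j \, \sigma_{t-j}^2
```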
In this thesis we have studied the DCC-GARCH model with Gaussian, Student’s t and skew Student’s t-distributed errors. For a basic understanding of the GARCH model, the univariate GARCH and multivariate GARCH models in general were discussed before the DCC-GARCH model was considered…
After presenting the theory, DCC-GARCH models were fit to a portfolio consisting of European, American and Japanese stocks assuming three different error distributions: multivariate Gaussian, Student's t and skew Student's t. The European, American and Japanese series seemed to have somewhat different marginal distributions. The DCC-GARCH model with skew Student's t-distributed errors performed best. But even the DCC-GARCH with skew Student's t-distributed errors did not explain all of the asymmetry in the asset series. Hence even better models may be considered. Comparing the DCC-GARCH model with the CCC-GARCH model using the Kupiec test showed that the first model gave a better fit to the data.
There are several possible directions for future work. It might be better to use other marginal models such as the EGARCH, QGARCH and GJR-GARCH, which capture the asymmetry in the conditional variances. If the univariate GARCH models are more correct, the DCC-GARCH model might yield better results. Other error distributions, such as the Normal Inverse Gaussian (NIG), might also give a better fit. When we fitted the Gaussian, Student's t- and skew Student's t-distributions to the data, we assumed all the distributions to be the same for the three series. This might be too restrictive a criterion. A model where the marginal distributions are allowed to be different for each of the asset series might give a better fit. One might then use a copula to link the marginals together.
From the comparison of distributions above, we know that the snorm distribution generates the most return (note: I should use LoHi instead of MnCl, since it would generate the highest ROI; however, most of the LoHi GARCH models run into large-data-size or large-NA-value errors, so I skip the LoHi data for this comparison). Now that we know the best p and q, the hybrid solver, and the best-fitted distribution, we try to compare the GARCH models.
The DCC-GARCH models, which are multivariate GARCH models, are left for future work:
\(\Re\) is the edge, or so-called advantage, of an investment. \(\rho_i^{EM}\) is the estimated probability calculated by firm A from matches 1, 2, … up to \(n\), while \(\rho_{i}^{BK}\) is the net/pure probability (real odds) offered by the bookmakers after we fit equation 4.1.2 into equation 4.1.1.
\(P_i^{Back}\) and \(P_i^{Lay}\) are the backed and laid fair prices offered by the bookmakers.
We can simply apply the equation above to get the value of \(\Re\). From the table above we know that the EMPrice calculated by firm A was invested at a threshold edge (price greater) of 2.20%, 3.19%, 0.64%, 3.87%, 0.54% above the prices offered by the bookmakers. There is some discussion of \(\Re\) in Dixon and Coles (1996)4 (kindly refer to the 25th paper in the industry knowledge and academic research portion under 7.4 References). The optimal value of \(\rho_{i}\) (rEMProbB) will be calculated by the bootstrapping/resampling method in section 4.3 Kelly Ⓜodel.
Now we look at the results of the soccer matches before filtering them for further modelling from this section onwards.
Profit and Loss of Investment
Stakes and Profit and Loss of Firm A at Agency A (2011~2015) ($0,000)
table 4.1.2 : 7 x 8 : Summary of betting results.
The table above summarises the stakes and returns on soccer match results, while the table below lists the handicaps placed by firm A at agency A. Since a Cancelled result is a null observation in discrete data modelling and cannot be counted in our model, I filter those observations out of the data here, after which the dataset contains 41055 observations in total.
CORRECTION : the cancelled matches need to be kept as a "push" so as to count the probability of a cancelled betslip as well, since this occurs in real life.
table 4.1.3 : 41055 x 66 : Odds price and probabilities sample table.
The table above lists a sample of the odds prices and probabilities of soccer match \(i\), where \(n\) indicates the number of soccer matches. It shows the values rEMProbB, netProbB and so forth.
graph 4.1.1 : A sample graph of the relationship between the investment probabilities -vs- the bookmakers' probabilities.
The graph above shows the probabilities calculated by firm A to back, plotted against the real probabilities offered by the bookmakers, over 41055 soccer matches.
I list the handicaps below before testing the coefficients by handicap in the next section, 4.2 Linear Ⓜodel.
table 4.1.4 : 8 x 6 : The handicap in sample data.
4.2 Linear Ⓜodel
From our understanding of staking, the only covariate we need to consider is the odds price, since the handicap covariate is settled according to the different handicaps of EMOdds.
Again, I don't pretend to know the correct Ⓜodel; here I simply apply a linear model to retrieve the value of EMOdds derived from the stakes. The purpose of measuring the edge over the bookmakers' vigorish is to know the leverage of the staking activity per 1 unit of edge in odds price by firm A at agency A. Referring to figure 4.4.1, I include the models that split pre-match and in-play for comparison.
From my time working at 188Bet and Singbet as well as AS3388, we know from experience that the odds price of the favourite team winning is the standard reference: the draw odds are adjusted a little, while the underdog team is ignored.
Steven Xu (2013)5 (kindly refer to the 16th paper in the industry knowledge and academic research portion) did a case study comparing the efficiency of the opening and closing prices of the NFL and college American football leagues, and found that the closing price is nowadays more efficient and accurate than the opening price, compared with the years 1980~1990. This might be due to the multi-million-dollar stakes of informed traders or smart punters tuning the closing price towards the true likelihood.
In order to test the empirical clichés, I conducted thorough research in ®γσ, Eng Lian Hu (2016)6 (kindly refer to the 3rd paper in the industry knowledge and academic research portion under 7.4 References; I completed the research in 2010 but wrote the thesis up in 2016) and concluded that the opening prices of the Asian Handicap and Goal Lines of 29 bookmakers are more efficient than mine. However, my later ®γσ, Eng Lian Hu (2014)7 (kindly refer to the 4th paper under 7.4 References) applied a Kelly staking model that made a return of more than 30% per season. Meanwhile, Dixon and Coles (1996) and Crowder, Dixon, Ledford and Robinson (2001)8 (kindly refer to the 27th paper under 7.4 References) built two models comparing the accuracy of home win, draw and away win predictions. A normal Poisson model reported the home win as more accurate, and therefore an ad-hoc inflation parameter is required to increase the accuracy of prediction. You are free to learn more about Dixon and Coles (1996) in section 4.4 Poisson Ⓜodel.
shinyapp 4.2.1 : WDW-AH conversion, and summary and ANOVA of linear models. Kindly click on regressionApps10 (you may select a Y response variable and X explanatory variable(s) to measure your own model, or use the existing models; refer to the Shiny height-weight example for further information about Shiny apps for linear models) to use the ShinyApp.
Here I simply attach a Fixed Odds to Asian Handicap calculator, based on the spreadsheet (version 1.1, year 2006) of my ex-colleague William Chen11 (my ex-colleague and best friend in the sportsbook industry, whom I have known since joining the industry in 2005: Telebiz and later Caspo Inc). You can simply input the home win, draw and away win (in decimal format), as well as the overround, to get the conversion result from the simple and basic equation.12 (Kindly refer to my previous research to learn about the vigorish / overround.)
From the summary of shinyapp 4.2.1, we can compare the models and obtain the best-fitted model.
table 4.2.1 : Application of linear regression models to test the effects on staking.
table 4.2.2A : Best model to test the effects of staking on all soccer matches (includes both pre-match and in-play).
table 4.2.2B : Best model to test the effects of staking on pre-match soccer matches.
table 4.2.2C : Best model to test the effects of staking on in-play soccer matches.
table 4.2.3 : Best model to test the effects of staking soccer matches.
Based on the few tables above and the summarised table 4.2.3, we can compare lm0 against lm0ip + lm0pm and conclude that the model lm0ip + lm0pm13 (BIC is the primary reference and AIC the secondary reference; the smallest value marks the best model: all = 446,424.83 against mixed = 444,750.45) is the best fit for determining the factors and effects of placing stakes for all matches14 (mixed = in-play + pre-match; all observations are the 41055 soccer matches on which bets were placed). The timing of in-play betting and the stake amount are the major effects on the return of investment.
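The BIC-first selection rule can be illustrated as follows (a sketch with stand-ins for the report's fitted models lm0, lm0pm and lm0ip, on simulated data):

```r
set.seed(1)
dat <- data.frame(y = rnorm(100), x = rnorm(100),
                  inplay = rep(c(TRUE, FALSE), each = 50))

lm0   <- lm(y ~ x, data = dat)                    # all matches pooled
lm0pm <- lm(y ~ x, data = subset(dat, !inplay))   # pre-match only
lm0ip <- lm(y ~ x, data = subset(dat, inplay))    # in-play only

## Smallest BIC wins; AIC is the secondary tie-breaker.
c(all = BIC(lm0), mixed = BIC(lm0pm) + BIC(lm0ip))
```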
John Fingleton & Patrick Waldron (1999) applied Shin's model and concluded that bookmakers in Ireland are infinitely risk-averse and balance their books. The authors cannot distinguish between inside information and operating costs, merely concluding that, combined, they account for up to 3.7% of turnover, whereas Asian bookmakers normally make less than 1% and one anonymous company has made around 2%. However, their revenue, or stakes, is far greater than that of the European bookmakers.15 (You can refer to my other project, Analyse the Finance and Stocks Price of Bookmakers, which analyses the financial reports of publicly listed companies as well as the profitable products' revenue and profit & loss of an anonymous company.)
They compare different versions of their model, using data from races in Ireland in 1993. The authors' empirical results can be summarised as follows:
They reject the hypothesis that bookmakers behave in a risk-neutral manner;
They cannot reject the hypothesis that they are infinitely risk-averse;
They estimate gross margins to be up to 4 per cent of total on-course turnover; and
They estimate that 3.1 to 3.7% (by value) of all bets are placed by punters with inside information.
figure 4.2.1 : Chance of Winning.
Since the Shin model in the paper was developed for the sake of the bookmakers, while this sportsbook consultancy firm is in fact an informed trader (meaning smart punters or an actuarial hedge fund, not ordinary gamblers betting on luck), I think of testing our previous data from the paper ®γσ, Eng Lian Hu (2016)16 (kindly refer to the 3rd paper in the industry knowledge and academic research portion under 7.4 References, which collects a dataset of the opening and closing odds prices of 40 bookmakers, 29 of them offering Asian Handicap and Goal Line; meanwhile, there is further research on smart punters (Punters Account Review (Agenda).xlsx) who made millions of dollars of profit from Ladbrokes, and you are free to browse the dataset for the paper), as well as the anonymous companies' revenue and P&L, to analyse the portion of smart punters among the customers in Analyse the Finance and Stocks Price of Bookmakers. However, the betslip of every single bet needs to be analysed, and SparkR and RHadoop as well as NoSQL are required to analyse the many millions of bets. It would be interesting to analyse the threat of hedge funds17 (kindly refer to 富传奇色彩的博彩狙击公司EM2, "the legendary betting-sniper company EM2", to learn the history and the threat that the EM2 sportsbook consultancy company poses to world-renowned bookmakers), since an anonymous brand among those under Caspo Inc was closed down due to heavy stakes from smart punters and the resulting losses. Well, I leave this for future research18 (listed in 6.2 Future Works) if the dataset becomes available.
Diagram 4.3.0 takes the idea from the paper above, which is to count the odds prices offered by the bookmakers into the calculation. My previous Odds Modelling and Testing Inefficiency of Sports Bookmakers odds modelling will be further enhanced over the next few years. In this staking model, I also use the idea to measure the weaknesses of the bookmakers and to enhance our staking strategies. Meanwhile, Application of Kelly Criterion model in Sportsbook Investment22 (from Part I we can see that we can easily make a profit from the bookmakers, while Part II will be enhanced to increase the profit and improve the money management) will use a basket of bookmakers' odds prices for the simulation.
video 4.3.1.1 : Using Kelly Criterion for Trade Sizing
video 4.3.1.2 : Option Trading - The Kelly criterion formula: Maximize your growth rate & account utility
video 4.3.1.3 : The Kelly criterion for trading options
To achieve profitable betting, one must develop a correct money management procedure. The aim for a punter is to maximize the winnings and minimize the losses. If the punter is capable of predicting accurate probabilities for each match, the Kelly criterion, which Edward O. Thorp (2006)23 (kindly refer to the 6th paper in 7.4 References) has proven to work effectively in betting, can be applied. It was named after John Kelly (1956)24 (kindly refer to the 26th paper in 7.4 References) and was originally designed for information transmission. The Kelly criterion is described below:
where \(S\) is the stake expressed as a fraction of one's total bankroll,
\(\rho_{EM}\) is the probability of the event taking place, and
\(BK_{Decimal\ odds}\) is the decimal odds (the return rate including the capital staked) while \(BK_{HK\ Odds}\) is the HK odds (the net profit rate excluding the capital staked) offered by the bookmaker for the event.
Since HK odds and decimal odds range over \((0,\infty]\) and the return lies in \([0,\infty]\), a logarithmic function is required; for Malay odds in \([-1,1]\) no logarithm is needed. Here I switch from equation 4.3.1.1 to equation 4.3.1.2 as below.
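A minimal sketch of the Kelly fraction with HK (net) odds, using the standard form \(f^{*} = \frac{b\rho - (1-\rho)}{b}\) (the function name and the figures are illustrative, not taken from the report's data):

```r
## b: HK odds (net profit per unit staked); p: estimated win probability.
kelly_fraction <- function(p, b) {
  f <- (b * p - (1 - p)) / b
  max(f, 0)                 # never stake on a non-positive edge
}

kelly_fraction(p = 0.55, b = 0.95)   # ~0.0763 of the bankroll
kelly_fraction(p = 0.40, b = 0.90)   # 0: no edge, no bet
```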
It maximizes the asymptotic growth rate of capital
Asymptotically, it minimizes the expected time to reach a specified goal
It outperforms in the long run any other essentially different strategy almost surely
figure 4.3.1.2 : Example of application Kelly criterion.
The criterion is known to economists and financial theorists by names such as the geometric mean maximizing portfolio strategy, the growth-optimal strategy, the capital growth criterion, etc. We will now show that Kelly betting will maximize the expected log utility for sports-book betting.
table 4.3.1.1 : 5 x 5 : Annual return on investment summary table, without cancelled bets.26 (rRates is the mean annual return rate, i.e. the return divided by the stakes, omitting the cancelled/voided bets to avoid bias.)
The rRates value in the table above excludes the Cancelled bets. Referring to equation 4.3.1.2, we now fit the edge value from equation 4.1.1 into it to get rEMProbB2 and rEMProbL2 from the known staked value \(S\)27 (the result will not be exact because, as mentioned at the start, firm A does not place bets via agent A alone: edges of 0.10 and 0.20 might both show a maximum bet of HKD40000 here while firm A placed different amounts through other agencies based on the different edges; nevertheless, from the stakes we can reverse-engineer the optimal EM odds), to replace the existing EM value.28 (Initially I thought of linear modelling: take the mean value, count the positive standard deviation as the edge range, and treat the residuals as the differences in stakes across leagues. That would be similar to the proportional staking model described in the paper Good and bad properties of the Kelly criterion by MacLean, Thorp and Ziemba (2010), which concludes that the full-Kelly model is the best model in the long run; you can refer to the references in Kelly Criterion - Part II for further understanding.)
Although the Kelly model is very simple, we need to separate the staking by league or time range to make it applicable to the real world. Again, I don't pretend to know the correct model, but I look for an applicable one by testing a few models and choosing the best among them.
We apply equation 4.3.1.3 to get the Kelly stake for every single soccer match.
Since several reference papers that examine staking strategies conclude that the full-Kelly model is the best fit and the most profitable over long-term investment, here I try to simulate the half-Kelly, quarter-Kelly, double-Kelly, etc. staking models and obtain the optimal weight-control parameter.29 (A reference paper in section 2 of Application of Kelly Criterion model in Sportsbook Investment also compares a few models and gives the pros and cons of the Kelly model in investment; the Kelly model remains the best over long-term investment. Besides, a few papers research and critique the Kelly model in financial markets as well as betting markets (including the rebates of the credit market); PIMCO's fund manager Bill Gross, who managed more than one trillion USD in funds, applied the Kelly model to portfolio management, and George Soros and Warren Buffett are said to have applied similar theories or methods, although there is no evidence to prove it. You are free to read more in the later section 4.5 Staking Ⓜodel and Ⓜoney Ⓜanagement; for further details kindly refer to Application of Kelly Criterion model in Sportsbook Investment.)
Fractional Kelly models are weight functions for the Kelly criterion. A Response to Professor Paul A Samuelson's Objections to Kelly Capital Growth Investing discusses the investment portfolio, comparing double-Kelly, full-Kelly, half-Kelly, quarter-Kelly and proportional betting across different stages of iteration, and concludes that full-Kelly is the best fit and grows through the ages. Fractional-Kelly models (double-Kelly, half-Kelly and quarter-Kelly, as opposed to full-Kelly) are elastic: a smaller fraction is more conservative, while double-Kelly is very risky and eventually leads to bankruptcy, since its staking leverage ratio is twice that of full-Kelly and exceeds what the capital can sustain. For further details kindly refer to Application of Kelly Criterion model in Sportsbook Investment. Therefore, in the basic Kelly above we used full-Kelly within the same league; but since there are different risk settings across different soccer leagues, a weight function is needed to make the staking strategy flexible. This is termed a Kelly portfolio, which diversifies the investment.
4.3.2 Fractional Kelly Ⓜodel
Now we fit a weight function into the basic Kelly model to obtain a fractional Kelly model. I use log to test the maximum value of the weight parameter. You can simply use \(w = \frac{1}{4}\) or \(\log(w) = \frac{1}{2}\), where \(w\) is a vector. Please bear in mind that a value greater than 1 is risky, since it involves leverage, while a smaller value is more conservative.
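The effect of the weight \(w\) can be checked against the expected log-growth per bet, \(g(f) = \rho\log(1+bf) + (1-\rho)\log(1-f)\) (a sketch with illustrative figures):

```r
## Expected log-growth of the bankroll for one bet at net odds b,
## win probability p and staking fraction f.
log_growth <- function(f, p, b) p * log(1 + b * f) + (1 - p) * log(1 - f)

p <- 0.55; b <- 0.95
f_full <- (b * p - (1 - p)) / b        # full-Kelly fraction

## Half-Kelly (w = 0.5) is more conservative; double-Kelly (w = 2) is
## leveraged and grows more slowly: over-betting destroys growth.
sapply(c(half = 0.5, full = 1, double = 2),
       function(w) log_growth(w * f_full, p, b))
```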
From Niko Marttinen (2001), we know that full-Kelly generates several times the profit of the fractional-Kelly models. However, there are two points that need to be enhanced:
The high risk at the beginning period of investment.
Testing different levels of edge, he concludes that 145% generates the highest return.
Below, Fabián Enrique Moya (2012) also tests the fractional Kelly models with diversified money-management methods.
paper 4.3.2.1 : Fabián Enrique Moya (2012)
League Stakes Profiling
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.1 : 177 x 6 : League stakes profiling of firm A year 2011~2015.
The league risk profile above is supposed to store the maximum bet for every single league, but I randomly selected only 6 leagues as a sample. Moreover, since I have not yet written a function for a real-time API30 (there is a lot of real-time XML odds-price and staking software, similar to the 4lowin2 mentioned at the beginning of Part I) to work with the operators and test the maximum stake per bet, I reverse the mean value here as the baseline stake for every single league, with a certain range of standard deviation, for the resampling simulation in a later section.
Stakes based reversed Kelly models
Basic Fractional Models
The stakes-based reversed Kelly models apply the parameters recovered by reversing the stakes, with some modified versions of the Kelly models added on. I adjust the stakes to obtain the resulting P&L.
Table 4.3.2.2A : Summary Table of Various Kelly Models (Stakes reversed based models)
[Flattened summary() output of the stakes-reversed Kelly models data frame. For each column it reports the Min., 1st Qu., Median, Mean and 3rd Qu. statistics. The columns cover the bet records (TimeUS, DateUS, Sess, League, Stakes, HCap, HKPrice, EUPrice, Result, Return, PL, PL.R, Rebates, RebatesS, rRates), the edge and probability estimates (netEMEdge, netProbB, netProbL, rEMProbB, rEMProbL, weight.stakes, weight), and the Kelly variants grouped by the prefixes Prop*, KProb*, KStakes*, KReturn*, KPL* and KPL*.R over the model families HKPriceEdge, netProbBEdge, HKPrice, netProbB, Fixed, FixednetProbB, EMProb, EMProbnetProbB, Half, HalfnetProbB, Quarter, QuarternetProbB, Adj, AdjnetProbB, HalfAdj, HalfAdjnetProbB, EMQuarterAdj and EMQuarterAdjnetProbB. The bets run from 2011-01-07 onwards, with mean Stakes 40.53 and mean PL 0.8713.]
3rd Qu.:2.052
3rd Qu.:0.5851
3rd Qu.:0.5676
3rd Qu.:1.18917
3rd Qu.: 0.12486
3rd Qu.:1
3rd Qu.:1
3rd Qu.: 47.893
3rd Qu.: 48.975
3rd Qu.: 23.8376
3rd Qu.: 24.332
3rd Qu.:10.885
3rd Qu.: 11.3964
3rd Qu.:1.0836
3rd Qu.:1.0782
3rd Qu.: 12.1869
3rd Qu.: 12.4157
3rd Qu.: 6.3529
3rd Qu.: 6.4489
3rd Qu.:10.885
3rd Qu.: 11.3964
3rd Qu.: 7.7735
3rd Qu.: 7.9939
3rd Qu.: 5.561
3rd Qu.: 5.6439
3rd Qu.: 204.54
3rd Qu.: 204.517
3rd Qu.: 50.00
3rd Qu.: 50.00
3rd Qu.: 50.00
3rd Qu.: 50.00
3rd Qu.:1.18917
3rd Qu.:1.18917
3rd Qu.: 11.806
3rd Qu.: 11.782
3rd Qu.: 2.2386
3rd Qu.: 2.2458
3rd Qu.: 50.00
3rd Qu.: 50.00
3rd Qu.: 5.69606
3rd Qu.: 5.8483
3rd Qu.: 0.681064
3rd Qu.: 0.74440
3rd Qu.: 223.1
3rd Qu.: 223.14
3rd Qu.: 53.55
3rd Qu.: 53.55
3rd Qu.: 53.55
3rd Qu.: 53.55
3rd Qu.:1.957
3rd Qu.:1.957
3rd Qu.: 12.458
3rd Qu.: 12.439
3rd Qu.: 1.980
3rd Qu.: 2.021
3rd Qu.: 53.55
3rd Qu.: 53.55
3rd Qu.: 6.160
3rd Qu.: 6.223
3rd Qu.: 0.7218
3rd Qu.: 0.7632
3rd Qu.: 85.152
3rd Qu.: 85.064
3rd Qu.: 20.2800
3rd Qu.: 20.2800
3rd Qu.: 20.2800
3rd Qu.: 20.2800
3rd Qu.: 0.92914
3rd Qu.: 0.92914
3rd Qu.: 4.6300
3rd Qu.: 4.6499
3rd Qu.: 0.5975
3rd Qu.: 0.62808
3rd Qu.: 20.2800
3rd Qu.: 20.2800
3rd Qu.: 2.36233
3rd Qu.: 2.3670
3rd Qu.: 0.27408
3rd Qu.: 0.28592
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.8600
3rd Qu.: 0.8600
3rd Qu.: 0.88
3rd Qu.: 0.90
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
3rd Qu.: 0.85000
Max. :2015-07-19 19:45:00
Max. :2015-07-19
Max. :2015
FRA D1 : 1256
Max. :1600.00
Max. : 8.250
Max. :3.9000
Max. :4.890
Win :17025
Max. :2992.00
Max. : 1392.0000
Max. : 2.6550
Max. : 3.040000
Max. : 4864.000
Max. :2.052
Max. :2.052
Max. :0.9053
Max. :0.9616
Max. :1.81204
Max. : 0.92314
Max. :1
Max. :1
Max. :1661.906
Max. :1793.534
Max. :812.3153
Max. :876.612
Max. :69.868
Max. :111.6521
Max. :1.1772
Max. :1.1349
Max. :406.4039
Max. :438.5324
Max. :203.4483
Max. :219.4924
Max. :69.868
Max. :111.6521
Max. :48.6334
Max. :72.2019
Max. :34.213
Max. :46.6908
Max. :6702.41
Max. :6702.423
Max. :1600.00
Max. :1600.00
Max. :1600.00
Max. :1600.00
Max. :1.81204
Max. :1.81204
Max. :399.515
Max. :399.587
Max. :99.2718
Max. :99.3804
Max. :1600.00
Max. :1600.00
Max. :204.07878
Max. :225.6129
Max. :26.030094
Max. :31.81324
Max. :12533.5
Max. :12533.53
Max. :2992.00
Max. :2992.00
Max. :2992.00
Max. :2992.00
Max. :2.287
Max. :2.287
Max. :746.925
Max. :746.915
Max. :185.388
Max. :185.372
Max. :2992.00
Max. :2992.00
Max. :337.202
Max. :334.563
Max. :38.0031
Max. :40.2482
Max. : 5831.098
Max. : 5831.108
Max. : 1392.0000
Max. : 1392.0000
Max. : 1392.0000
Max. : 1392.0000
Max. : 0.98481
Max. : 0.98481
Max. : 347.5000
Max. : 347.4951
Max. : 86.2500
Max. : 86.24260
Max. : 1392.0000
Max. : 1392.0000
Max. : 156.88006
Max. : 155.6523
Max. : 17.68057
Max. : 17.40491
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.6500
Max. : 2.6500
Max. : 2.65
Max. : 2.65
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
Max. : 2.65000
NA
NA
NA
(Other):32136
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA’s :1348
NA’s :1383
NA’s :10832
NA’s :10297
NA
NA
NA
NA
NA
NA
table 4.3.2.2A : 41055 x 112 : Summary of Stakes reversed Kelly models year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Stakes Reversed based Kelly Models
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.2B : 19 x 6 : PL of Stakes based reversed Kelly models year 2011~2015.
Mean with Min-Max Range Fractional Models
Since there is no league-level risk management profile, I use the mean value of stakes in every single league as the baseline, and set minimum and maximum values to run 100 simulations.
table 4.3.2.2C : 19 x 5 : Summary of Stakes reversed Kelly models (mean value of stakes with min-max range as staking adjuster) year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Stakes Reversed based Kelly Models (Mean with Min-Max Adjusted Stakes)
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.2D : 19 x 5 : PL of Stakes reversed Kelly models (mean value of stakes with min-max range as staking adjuster) year 2011~2015.
Mean with sd Range Fractional Models
Since there is no league-level risk management profile, I use the mean value of stakes in every single league as the baseline.
table 4.3.2.2E : 19 x 5 : Summary of Stakes reversed Kelly models (mean value of stakes with sd range as staking adjuster) year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Reversed Stakes based Kelly Models (Mean with sd Adjusted Stakes)
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.2F : 19 x 5 : PL of Stakes reversed Kelly models (mean value of stakes with sd range as staking adjuster) year 2011~2015.
Reversed rEMProbB based Kelly models
Basic Fractional Models
rEMProbB (real EM Probabilities Back) applies the parameter recovered by reversing the stakes, with some modified versions of the Kelly models added on. For the EM-probability-based models, I simply adjusted the staking and obtained different profit-and-loss outcomes.
Table 4.3.2.3 : Summary Table of Various Kelly Models (reversed rEMProbB based models)
(Flattened R summary() output omitted. The 94 columns comprise TimeUS, DateUS, Sess, League, Stakes, HCap, HKPrice, EUPrice, Result, Return, PL, PL.R, Rebates, RebatesS, rRates, netEMEdge, netProbB, netProbL, rEMProbB, rEMProbL, weight.stakes, weight, plus the KStakes*, KReturn*, KPL* and KPL*.R Kelly model families; per-column Min, quartiles, Median, Mean, Max and NA counts are omitted. Result counts: Win 17025, Half Win 3052, Push 3778, Half Loss 2798, Loss 14374, Cancelled 28.)
table 4.3.2.3A : 41055 x 94 : Summary of Reversed rEMProbB Kelly models year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Reversed rEMProbB Kelly Models
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.3B : 19 x 5 : PL of Reversed rEMProbB Kelly models year 2011~2015.
Mean with Min-Max Range Fractional Models
Since there is no league-level risk management profile, I use the mean value of stakes in every single league as the baseline, and set minimum and maximum values to run 100 simulations.
table 4.3.2.3C : 19 x 5 : Summary of Reversed rEMProbB Kelly models (mean value of stakes with min-max range as staking adjuster) year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Reversed rEMProbB Kelly Models (Mean with Min-Max Adjusted Stakes)
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.3D : 19 x 5 : PL of Reversed rEMProbB Kelly models (mean value of stakes with min-max range as staking adjuster) year 2011~2015.
Mean with sd Range Fractional Models
Since there is no league-level risk management profile, I use the median value of stakes in every single league as the baseline.
table 4.3.2.3E : 19 x 5 : Summary of Reversed rEMProbB Kelly models (mean value of stakes with sd range as staking adjuster) year 2011~2015.
From the table summary above, we can see the range of risk management applicable to the various adjusted Kelly models. Next, we compare the profit and loss in the table below.
PL of Reversed rEMProbB Kelly Models (Mean with sd Adjusted Stakes)
Stakes of Firm A at Agency A (2011~2015) ($0,000)
table 4.3.2.3F : 19 x 5 : PL of Reversed rEMProbB Kelly models (mean value of stakes with sd range as staking adjuster) year 2011~2015.
4.3.3 Weighted Fractional Kelly Ⓜodels
In the previous section I measured the 2011~2015 data as a static analysis. Now I separate the data by year and estimate the optimal weight value for use in the following year. I am not certain whether a weight function is needed for staking models, although the sports consultancy firm applied Poisson models with a weight function. In theory a die has a \(\frac{1}{6}\) chance of landing on each face; in practice, however, the theoretical probabilities are not exact. Papers that applied the Bernoulli distribution over a large number of iterations found outcomes such as 0.499 and 0.501 for over/under rather than exactly 0.5 each, due to effects such as the balance of the die, the flatness of the table, wind, momentum, and so on. I don't pretend to know the true values and only estimate the optimal weight by simulation and observation.
Because each fractional Kelly model is an independent model (for example, the half-Kelly staking model remains half-Kelly and the full-Kelly model remains full-Kelly across the years, as compared in section [4.3.2 Fractional Kelly Ⓜodel]), we now need to make it a weighted fractional model, similar to my previous ®Model applied to the Poisson model. Because the settlement of wins and losses under Asian Handicap differs from fixed odds, the outcome probabilities are discrete, and measuring the likelihood of the results is required in order to maximise profit. Here we need to add an additional controller parameter to adjust the staking amount on every single match.
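Since the half-win/half-loss settlement is central to the argument above, the sketch below shows how a quarter handicap splits a stake into two half-bets on the adjacent lines. This is a minimal illustration in Python (the analysis in this paper is done in R); the function name and signature are my own, not firm A's.

```python
def ah_settle(stake, odds, handicap, margin):
    """Profit/loss of an Asian Handicap back bet.

    stake    -- amount risked
    odds     -- European (decimal) odds, e.g. 1.93
    handicap -- handicap taken by the backed team, e.g. -0.75
    margin   -- goal supremacy of the backed team (scored - conceded)

    Quarter lines (x.25, x.75) settle as two half-stakes on the adjacent
    half/whole lines, which produces the half-win/half-loss results.
    """
    if round(handicap % 0.5, 2) == 0.25:          # quarter handicap
        return (ah_settle(stake / 2, odds, handicap - 0.25, margin)
                + ah_settle(stake / 2, odds, handicap + 0.25, margin))
    adjusted = margin + handicap
    if adjusted > 0:
        return stake * (odds - 1.0)               # full win
    if adjusted == 0:
        return 0.0                                # push (stake refunded)
    return -stake                                 # full loss
```

For example, backing at -0.75 with a one-goal win settles half the stake as a win at -0.5 and pushes the other half at -1, i.e. a half win — an outcome that has no analogue under fixed odds.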
Now I simulate an enhanced Kelly staking model that takes the match result into account: the controller parameter \(\phi(r)\) below fits into \(equation\ 4.3.3.1\) to control the leverage ratio.31 A similar theory applies to investment portfolios, although it may turn into nested controller parameters across different soccer leagues.
\[\phi(r) = \exp(w_{i}\rho_{i}) \cdots equation\ 4.3.3.1\] where \(X = x_{1,2,3...n}\) is the original staking amount from the Kelly model, and \(r\) is the optimal controller parameter for staking.
Here I diversify the weight parameters in equation 4.3.3.2. The first year's data serves as the baseline for the analysis of subsequent years. The \(\phi(r)\) function is a constant which applies the previous years' data to the current year's staking model. You can also use other methods to find your own.
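As a concrete sketch of equation 4.3.3.1, the snippet below scales a raw Kelly stake by \(\phi(r)=\exp(w_{i}\rho_{i})\). The interpretation of \(\rho_i\) (here, a per-league signal estimated from the previous years' data) and the function names are my own assumptions, illustrated in Python rather than the paper's R code.

```python
import math

def phi(w, rho):
    """Controller parameter phi(r) = exp(w * rho) from equation 4.3.3.1."""
    return math.exp(w * rho)

def weighted_stake(kelly_stake, w, rho):
    """Raw Kelly stake scaled by the leverage controller phi(r)."""
    return kelly_stake * phi(w, rho)
```

With w = 0 the controller is neutral (phi = 1) and the model reduces to the unweighted fractional Kelly stake; estimating w from the previous year's data then leverages or de-leverages the current year's stakes.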
Since we cannot foresee the result before a soccer match starts, I categorise the handicaps from -1, -0.75, -0.5, … up to 1 as a handicap set.
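The handicap set above is just an evenly spaced quarter-goal grid; a one-line Python sketch (the endpoints -1 and 1 are taken from the text above):

```python
# Quarter-goal handicap set from -1 to 1, as categorised in the text.
handicap_set = [x / 4 for x in range(-4, 5)]
print(handicap_set)  # [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
```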
table 4.3.3.1 : sample data of weighted handicap
Weighted Value Estimation for Weighted Kelly Models
Weighted Table (2011~2015)
table 4.3.3.1 : 5 x 8 : The static weight parameter from year 2011~2015.
Table 4.3.3.1 above32 is flawed because we cannot foresee the result of a soccer match before kick-off. Here I wrote two kinds of weight functions for the handicap parameters: a static weighted parameter that is constant across a year, or alternatively you can apply Expectation Maximization to simulate a dynamic vector of weight parameters across the soccer matches. A later section conducts a Monte Carlo simulation from 2011 until 2015 to get the best-fitting outcome.
Stakes based reversed Kelly models
Weighted Fractional Models
Stakes-based reversed Kelly models apply the parameter recovered by reversing the stakes, with some modified versions of the Kelly models added on. I adjusted them by adding a constant weight value theta to obtain the resulting PL outcome.
PL of Stakes Reversed based Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.3.2A : 19 x 27 : Summary of Stakes reversed weighted 1 Kelly models 1 year 2012~2015.
Now we look at a vector of dres weighted values.
PL of Stakes Reversed based Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.3.2B : 19 x 27 : Summary of Stakes reversed weighted 2 Kelly models 2 year 2012~2015.
Reversed rEMProbB based Kelly models
Weighted Fractional Models
rEMProbB (real EM Probabilities Back) applies the parameter recovered by reversing the stakes, with some modified versions of the Kelly models added on. For the EM-probability-based models, I simply adjusted by adding a constant theta to obtain different profit-and-loss outcomes.
PL of Reversed Prob Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.3.3A : 19 x 11 : Summary of Reversed rEMProbB weighted 1 Kelly models 2 year 2012~2015.
Now we look at a vector of dres weighted values.
PL of Reversed Prob Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.3.3B : 19 x 11 : PL of Reversed rEMProbB weighted 2 Kelly models 2 year 2012~2015.
4.3.4 Dynamic Fractional Kelly Ⓜodel
Comparison
Since the weighted models only analyse years 2012~2015, here I summarise the static and weighted data to compare their profit and loss before moving to the next section.
table 4.3.4.1A : 19 x 27 : PL of Reversed stakes dynamic 1 Kelly models 1 year 2012~2015.
PL of Stakes Reversed based Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.4.1B : 19 x 27 : PL of Reversed stakes dynamic 2 Kelly models 1 year 2012~2015.
Reversed rEMProbB based Kelly models
PL of Reversed Prob Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.4.2A : 19 x 11 : PL of Reversed rEMProbB dynamic 1 Kelly models 2 year 2012~2015.
PL of Reversed Prob Kelly Models
Stakes of Firm A at Agency A (2012~2015) ($0,000)
table 4.3.4.2B : 19 x 11 : PL of Reversed rEMProbB dynamic 2 Kelly models 2 year 2012~2015.
4.3.5 Bank Roll
There are a few points we need to consider in order to determine the initial investment capital \(BR\):
Risk aversion to ruin33 The daily loss cannot exceed the investment fund, otherwise the fund is bankrupt before it can grow; the Kelly model is somewhat risky in the beginning period but becomes stable as time goes by.
Initial invested capital
The time-zone differences (firm A is a British-based sports consultancy)
The financial settlement time of Asian operators (daily settlement at 12:00 PM Hong Kong time, GMT+8; credit market with rebates)34 Here I treat kick-off times from 12:00 GMT+8 until 11:59 the next morning (which covers the American time zone, GMT-4) as one soccer betting date, and re-categorise the soccer financial settlement dates accordingly. I have no historical match dataset from bookmakers, and the scraped spbo kick-off times are unstable (they change frequently, and it is merely an information website), whereas firm A actually placed bets worth millions of HKD (although kick-off times may also change after a particular bet is placed); therefore I follow firm A's recorded kick-off times.
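The settlement-date rule above can be sketched as a small helper: kick-offs from 12:00 noon GMT+8 through 11:59 the next morning belong to the same betting date. A minimal Python illustration (the project's own re-categorisation is done in R):

```python
from datetime import datetime, timedelta

def betting_date(kickoff_hk: datetime):
    """Map a kick-off time (already in Hong Kong time, GMT+8) to its
    settlement 'betting date': the betting day runs from 12:00 noon until
    11:59 the next morning, so a kick-off before noon counts towards the
    previous day's settlement."""
    if kickoff_hk.hour < 12:
        return (kickoff_hk - timedelta(days=1)).date()
    return kickoff_hk.date()
```

For example, a 03:00 kick-off on 2011-05-15 settles under the 2011-05-14 betting date.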
graph 4.3.5.1 : Sample data of bank roll and fund growth for basic Kelly model. ($1 = $10,000)
The graph above shows a basic Kelly model; we can see that the initial fund size is not unified.
Since our bank roll cannot fall below 0 (otherwise the fund is ruined), I set the initial account balance from the minimum value of the variable SPL: the balance before placing bets must stay above 0, otherwise no bets can be placed. Section 4.5.1 Risk Management will unify the initial fund size and the Kelly portion across the league profiles.
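The positive-balance constraint above amounts to: the required starting capital is the worst running drawdown of the cumulative PL, plus a small cushion. A minimal Python sketch (the function and variable names are mine, not the SPL column itself):

```python
from itertools import accumulate

def required_initial_balance(pl_series, buffer=1.0):
    """Smallest starting bank roll that keeps the running balance positive.

    pl_series -- per-settlement profit/loss amounts in time order
    buffer    -- small cushion so the balance stays strictly above zero
    """
    worst = min(accumulate(pl_series))   # deepest point of the equity curve
    return max(0.0, -worst) + buffer
```

For a PL path of -10, -20, +15 the cumulative minimum is -30, so at least 30 (plus the buffer) is needed up front.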
The file BankRoll.csv records the profit and loss of the staking.
As I mentioned at the beginning of the research paper, the stakes only reflect the profit and loss of agency A, not of firm A. Firm A might deal with 10~50 or even more agencies, and the data from year 2011 is not from the initial investment year. Feel free to download the file. We will discuss inventory management to reduce the risk.
Graph 4.3.5.1 has an Event label for marking a specific event on a specific date or time; I leave it mostly unused and only mark high-volatility event dates. From BankRoll.csv we observe a serious crash at the end of the soccer season in May 2011 (dat %>% filter(DateUS >= '2011-05-14' & DateUS <= '2011-05-21')). We can investigate the losing matches in more detail from the data (or filter that range of bets in the data table inside Part I).
Comparison of Summarized Kelly Investment Funds
table 4.3.5.2 : Summary of 110 Kelly main funds.
From table 4.3.5.2 we can see the risk of all the investment funds. Feel free to browse KellyApps for more details. You can also refer to Faster Way of Calculating Rolling Realized Volatility in R to measure the volatility of the fund, which I omit at this stage.
shinyapp 4.3.5.1 : Kelly sportsbook investment fund. Kindly click on KellyApps to use the ShinyApp.35 The ShinyApp contains both basic fund management and portfolio management, which is covered in the later section 4.5.1 Risk Management.
4.4 Poisson Ⓜodel
4.4.1 Niko Marttinen (2001)
Data has been collected over the last four seasons in the English Premier League. These include 1997-1998, 1998-1999, 1999-2000 and 2000-2001 seasons. We have also collected the season 2000-2001 data from the main European football betting leagues, such as English Division 1, Division 2, Division 3, Italian Serie A, German Bundesliga and Spanish Primera Liga…
quote 4.4.1.1 : the dataset for the studies (source : Niko Marttinen (2001)).
figure 4.4.1.2 : Comparison of various mixed Poisson models II (source : Niko Marttinen (2001)).
figure 4.4.1.3 : Comparison of various mixed Poisson models III (source : Niko Marttinen (2001)).
From the models above, the author lists the models and states that even the worst model among them is still more accurate than the bookmaker, while E(Score)&Dep&Weighted is the best.
figure 4.4.1.4 : Comparison of various odds modelling models (source : Niko Marttinen (2001)).
Besides, Niko Marttinen (2001) not only chose the Poisson model throughout as the odds modelling model but also compared it to the models below :-
ELO ratings.
multinomial ordered probit model.
He concludes that the multinomial ordered probit model is the best-fitting model, but the software for fitting it is not generally available. Meanwhile, the Poisson model is more versatile than the probit/logit model based on the dataset across the European soccer leagues.38 There are many papers on the application of logit/probit models to soccer betting; you might read through them and compare with my ®Model®γσ, Eng Lian Hu (2016). I have read through the logit/probit literature, which involves complicated parameter settings for various effects such as weather, players' condition, coach, pitch condition, and even travel distance and player-stamina modelling.
You can read for more details from paper 4.3.1.1 : Niko Marttinen (2001).
4.4.2 Dixon and Coles (1996)
Here we introduce the Dixon and Coles (1996) model and its code. You can freely learn from the links below if interested.
The soccer matches are drawn at random from different leagues, and the results are not Bernoulli win-lose but include half win/lose etc., as we saw above. Besides, Pre-Game and In-Play soccer matches were mixed together, so I filter the sample data down to only English soccer leagues, as in the shinyApps. I don't pretend to know the correct answer or firm A's model. However, I take a sample presentation, Robert Johnson (2011)39 Kindly refer to the 23rd paper in 7.4 References, from one of the consultancy firms, which is a Dixon-Coles model, and I omit the scoring-process section.
4.4.3 ®γσ, Eng Lian Hu (2016)
Below is my previous research paper, which was more sophisticated than the Dixon-Coles model. You can refer to it; I will omit the section here, as mentioned at the beginning of this staking-validation research paper.
Here I cannot reverse-compute from \(\rho_i^{EM}\) alone without knowing the \(\lambda_{ij}\) and \(\gamma\) values. Meanwhile, the staked matches are a discrete random set of soccer teams across all leagues and tournaments. Therefore I simply use the reverse EM probabilities via the mean value of the edge, as in the previous Kelly section.
In order to minimize the risk, I tried to validate the odds price range invested by firm A.40 When I worked at AS3388, which always took bets from Starlizard, they only placed bets within the odds price range from 0.70 ~ -0.70; they did not place bets at all odds prices carrying the same edge. The sportsbook consultancy firms probably do not place the same amount of stakes at the same edge; let's take the examples below :-
\(Odds_{em}\) = 0.40 while \(Odds_{BK}\) = 0.50; the edge to the firm is 0.5 ÷ 0.4 = 1.25
\(Odds_{em}\) = 0.64 while \(Odds_{BK}\) = 0.80; the edge to the firm is 0.8 ÷ 0.64 = 1.25
The edges above are the same, but the probability of an event/goal occurring at 0.4 is smaller than at 0.64. In 4.3.3 Weighted Fractional Kelly Ⓜodels I used a weight function to measure the effects of Win-All, Win-Half, Push, Loss-Half, Loss, and also Cancelled, and it generated a higher profit.
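As a minimal sketch (the helper `edge` and its arguments are my own naming, not firm A’s code), the two examples above can be computed as:

```r
# Edge of the bookmaker's odds price over the EM-model odds price,
# following the worked examples above (illustrative helper only).
edge <- function(odds_em, odds_bk) odds_bk / odds_em

edge(0.40, 0.50)  # 1.25
edge(0.64, 0.80)  # 1.25
```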
Again, I don’t pretend to know the correct models, but here I try to simulate the occurrence of the combination and independent handicaps by testing the distributions below.
categorise the handicap set (from the start of the previous session until the latest soccer matches, as the weight function discussed in 4.3.3 Weighted Fractional Kelly Ⓜodels)
figure 4.4.4.3 : Diagram of network among probability distributions
From the diagram above we can see the network of relationships among probability distributions. Besides, we can refer to the probability distribution listing below, attached with R code and examples for practice :
figure 4.4.4.4 : R functions in probability distributions
Here I bootstrap/resample the match scores in the dataset to test the Kelly model and get the mean/likelihood value. Bootstrapping the scores and the staking model falls under the following sections, [4.5 Staking Ⓜodel and Ⓜoney Management] and 4.6 Expectation Ⓜaximization and Staking Simulation.
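The bootstrap idea can be sketched as below; the synthetic scores, the flat-stake rule, and the 0.95 payout are illustrative assumptions of mine, not the paper’s fitted model.

```r
# Sketch: resample match scores with replacement and evaluate a toy
# flat-stake rule on each resample, giving a bootstrap profit distribution.
set.seed(2016)
scores <- data.frame(home = rpois(200, 1.5), away = rpois(200, 1.1))

# Mean profit per bet when backing the home side at an assumed 0.95 payout.
profit <- function(df) mean(ifelse(df$home > df$away, 0.95, -1))

boot <- replicate(1000, profit(scores[sample(nrow(scores), replace = TRUE), ]))
c(mean = mean(boot), sd = sd(boot))  # bootstrap summary of the mean profit
```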
4.5 Staking Ⓜodel and Ⓜoney Ⓜanagement
4.5.1 Risk Management
graph 4.5.1.1 : Sample data of candlestick chart for fund growth of firm A via agent A. ($1 = $10,000)
The table above shows the return rates of the investment fund from firm A via agent A. The initial investment fund grew from $47,788,740.00 to $405,511,999.00 between 2011-01-07 and 2015-07-19, a high return rate of 848.55%. Well, I try to apply the Kelly model for stake management, as described in the previous section 4.3 Kelly Ⓜodel.
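For reference, the classical Kelly stake fraction \(f^{*} = \frac{bp - q}{b}\) (the textbook formula, not firm A’s proprietary rule) can be computed as:

```r
# Classical Kelly criterion: p = win probability, q = 1 - p,
# b = net odds received per unit staked. Textbook formula, for illustration.
kelly_fraction <- function(p, b) {
  f <- (b * p - (1 - p)) / b
  max(f, 0)  # do not stake when there is no positive edge
}

kelly_fraction(p = 0.55, b = 1)  # ~0.10 of the bankroll
kelly_fraction(p = 0.40, b = 1)  # 0: negative edge, no bet
```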
From table 4.3.5.2 we can see the risk of all the investment funds. Feel free to browse KellyApps41 (shinyapp 4.3.5.1) for more details.
In order to equalise the initial fund size, here I unify it at $22,100.00, which is the maximum value among the 110 funds.
From the shinyApps above, we can see the initial fund required, the risk, and the return on investment of following firm A, with a time series applied to the Kelly staking models.
I have built and tested 110 Kelly main funds (split into independent funds, 6,194 funds altogether), but now I need to set a baseline for every single league and simulate the steps in 4.3 Kelly Ⓜodel again, to make a comparison based on the portion of stakes from the initial pools for every single league, as we can see in table 4.3.2.1.
League Stakes Profiling
Staking allocation and portfolio ($0,000)
table 4.5.1.2 : 177 x 3 : A simple new league stakes profile.
table 4.5.1.2 is just a simple baseline profile for league risk management to test the Kelly models. Here I adjust a few points for comparison :
united the initial fund size
set a baseline staking portfolio across the leagues
apply the various Kelly models, but not the reversed models
Martin Spann and Bernd Skiera (2009)44 (kindly refer to the 19th paper in the ‘Reference for industry knowledge and academic research’ portion under 7.4 References) applied basic probability sets to drawn games and to the proportions of wins and losses. The authors simply measured the proportion of draw results against wins/losses to get an edge on which to place a bet. However, it made a loss with the Italian operator Oddset due to the high 25% vigorish, while being profitable at 12%. Secondly, the bets were placed at fixed odds, not Asian Handicap, and with a fixed amount of $100.
video 4.5.2.1 : Calculated Bets - computers, gambling, mathematical modeling to Win (part 4 of 4)
table 4.3.2.1 shows a risk portfolio for every single league, which is roughly similar to Parimutuel Betting but expressed as a portion of the initial fund required. In order to equalise the initial fund size, in the next section I will unify it and also apply a resampling method to get the optimal league risk profile.
The Kelly model makes a good risk-averse investment portfolio. As we know, a mutual fund, or any investment fund, normally advises investors to credit a certain amount of money into the pool regularly. For this section I keep that as the next study, in Application of Kelly Criterion model in Sportsbook Investment - Part II, as a dynamic staking baseline model upon injection of new funds into the pool. It will include :
bonus issue or dividends
refill or pump-in money into the pool
fund management and admin fees
equation 4.5.4.1 : Economic Order Quantity (EOQ)
Based on the above equation, the criteria are as follows :
\(C\) is the total cycle-inventory cost per annum, which is the amount invested or pumped into the investment pool.
\(Q\) is the fund size pumped into the investment pool. (For example: normally an investment fund or insurance company will advise investors to regularly credit a certain amount of money into their investment accounts.)
\(H = 1\) since there is no holding cost per annum, unless the account is inactive, in which case a certain administration fee is charged; this does not apply to active players.
\(D\) is the betting stakes per annum.
\(S = 1\) since there are no setup costs per lot (unless we count bank charges, for example : Western Union, Entropay, bank transfer fees, etc.).
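Under those criteria the standard EOQ solution is \(Q^{*} = \sqrt{2DS/H}\); below is a small illustrative helper (my own code, and the $22,100 annual-stakes figure is only an example):

```r
# Economic Order Quantity: D = betting stakes per annum, S = setup cost
# per lot, H = holding cost per unit per annum (both 1, per the criteria above).
eoq <- function(D, S = 1, H = 1) sqrt(2 * D * S / H)

eoq(D = 22100)  # optimal refill size for an assumed $22,100 staked per annum
```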
Since there are negative lambda values, we are unable to generate scores from a basic bivariate normal distribution. Here I try to build a few models in order to find the best-fit model for soccer score resampling.
1st bivariate normal distribution
adjust all negative values to 0
measure the mean of the negative values and add it to all positive values as an average.
2nd bivariate normal distribution
adjust all negative values to 0
measure the mean of the negative values and subtract it from all positive values as an average.
multiply all values by the proportion of the baseline, which is the raw dataset.
3rd bivariate normal distribution
adjust all negative values to 0
apply a nested normal distribution on the mean and sd of all negative values and distribute it to all positive values as an average in rnorm.
apply looping to get the likelihood values.
4th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
set min as the lower interval and max as the upper interval.
multiply all values by the proportion of the baseline, which is the raw dataset.
5th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
manually set min as the lower interval and (3, 2.3)47 (3 goals for the home team and 2.3 goals for the away team) as the upper interval.
6th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
manually set (1, 0)48 (1 goal for the home team and 0 goals for the away team) as the lower interval and (2, 2.3)49 (2 goals for the home team and 2.3 goals for the away team) as the upper interval.
7th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
set the min values as the lower interval and the max values as the upper interval.
multiply all values by the proportion of the baseline, which is the raw dataset.
8th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
set the min values as the lower interval and the mean values as the upper interval.
apply looping to get the likelihood values by stepwise adding 0.0001 to the upper interval.
9th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
set the 1st quantile values as the lower interval and the 3rd quantile values as the upper interval.
apply looping to get the likelihood values by stepwise expanding both the lower and upper intervals by 0.0001.
10th bivariate normal distribution
apply a truncated adjustment with a set of lower and upper values.
set the mean values as the lower interval and the 3rd quantile values as the upper interval.
apply looping to get the likelihood values by stepwise expanding both the lower and upper intervals by 0.0001.
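As a hedged sketch of the truncation step shared by models 4 to 10 (the means, covariance, and bounds below are illustrative choices of mine, not the fitted values), a truncated bivariate normal can be drawn by rejection sampling in base R:

```r
# Draw n points from a bivariate normal (mu, Sigma) truncated to the box
# [lower, upper] by simple rejection sampling; sketch only.
rtbvnorm <- function(n, mu, Sigma, lower, upper) {
  L   <- chol(Sigma)            # upper-triangular Cholesky factor of Sigma
  out <- matrix(NA_real_, 0, 2)
  while (nrow(out) < n) {
    z    <- matrix(rnorm(2 * n), n, 2) %*% L       # correlated normals
    x    <- sweep(z, 2, mu, `+`)                   # shift by the means
    keep <- x[, 1] >= lower[1] & x[, 1] <= upper[1] &
            x[, 2] >= lower[2] & x[, 2] <= upper[2]
    out  <- rbind(out, x[keep, , drop = FALSE])    # keep in-box draws
  }
  out[seq_len(n), ]
}

set.seed(1)
s <- rtbvnorm(500, mu = c(1.4, 1.1), Sigma = matrix(c(1, .3, .3, 1), 2),
              lower = c(0, 0), upper = c(3, 2.3))
colMeans(s)  # means of the truncated home/away scores
```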
graph 4.6.1.1A : comparison of random scoring models.
graph 4.6.1.1B : comparison of random scoring models.
graph 4.6.1.1C : comparison of random scoring models.
graph 4.6.1.1D : comparison of random scoring models.
graph 4.6.1.1E : comparison of random scoring models.
graph 4.6.1.1F : comparison of random scoring models.
graph 4.6.1.1G : comparison of random scoring models.
graph 4.6.1.1H : comparison of random scoring models.
graph 4.6.1.1I : comparison of random scoring models.
graph 4.6.1.1J : comparison of random scoring models.
Truncated Bivariate Normal Distribution
Mean values and comparison among randomized bivariate scoring models
table 4.5.2.1A : 22 x 4 : Comparison of truncated bivariate normal distributions.
Truncated Bivariate Normal Distribution
Variance among randomized bivariate scoring models
table 4.5.2.1B : 22 x 5 : Comparison of truncated bivariate normal distributions.
Truncated Bivariate Normal Distribution
Summary among randomized bivariate scoring models
table 4.5.2.1C : 22 x 5 : Comparison of truncated bivariate normal distributions.
table 4.5.2.1D : 11 x 12 : Comparison of truncated bivariate normal distributions.
From the models above, we know that opt10 is the best-fit model, and I’ll apply it in the later section 4.6.2 Resampling Scores and Stakes. Feel free to refer to the articles below for further understanding:
Now we look at the above function from a different perspective, by considering the observed values \(x_{1}, x_{2}, x_{3}, \ldots, x_{n}\) to be fixed parameters of this function, whereas \(\rho\) is the function’s variable, allowed to vary freely; this function is called the likelihood.
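As a generic illustration of that definition (a Poisson toy example of mine, not the paper’s model), the observations are held fixed while the parameter varies:

```r
# The likelihood treats the observed x as fixed and the parameter as free:
# L(lambda) = prod(dpois(x, lambda)); we maximise its log over lambda.
set.seed(123)
x <- rpois(100, lambda = 1.6)

loglik <- function(lambda) sum(dpois(x, lambda, log = TRUE))
mle <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)$maximum

c(mle = mle, sample_mean = mean(x))  # the Poisson MLE coincides with the mean
```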
table 5.1.1.1 : 2282 x 119 : Comparison of Kelly investment funds.
5.1.2 Comparison of Missing Bets
Since I have no followed-bets data, there is no real clue; but based on my previous experience at Telebiz, Caspo, and Global Solution Sdn Bhd, as listed in my CV, I try to adjust the settings to a 90% successful following rate and 90% of the average odds compared to firm A.
According to the above result, in order to make it workable in real life, here I simulate the whole Kelly function with the Monte Carlo method to get the return.
Here I skip this sub-section due to having no following-bets dataset; moreover, it is a profit-sharing investment business.
Equipped with this knowledge, let’s see which products tend to complement each other with high lift (i.e. purchase of one product would lead to purchase of another with high probability) and which products tend to be substitutes:
## Total Goals - under Total Goals - over Leicester
## Total Goals - under NA 0.1068912 0.0000000
## Total Goals - over 0.10689122 NA 0.3859765
## Leicester 0.00000000 0.3859765 NA
## Liverpool 0.09666446 0.7522156 0.0000000
## Guingamp 0.19782493 0.5497921 0.0000000
## Liverpool Guingamp
## Total Goals - under 0.09666446 0.1978249
## Total Goals - over 0.75221561 0.5497921
## Leicester 0.00000000 0.0000000
## Liverpool NA 0.0000000
## Guingamp 0.00000000 NA
## Warning in apriori(trans3, parameter = list(support = 0.001, minlen =
## 2, : Mining stopped (maxlen reached). Only patterns up to a length of 2
## returned!
The dataset I collected covers just one agent among a couple of sports bookmakers, 4lowin. Here I cannot determine whether the sample data represents the population…
JA : What skills and academic training (example: college courses) are valuable to sports statisticians?
KW : I would say there are three sets of skills you need to be a successful sports statistician:
Quantitative skills - the statistical and mathematical techniques you’ll use to make sense of the data. Most kinds of coursework you’d find in an applied statistics program will be helpful. Regression methods, hypothesis testing, confidence intervals, inference, probability, ANOVA, multivariate analysis, linear and logistic models, clustering, time series, and data mining/machine learning would all be applicable. I’d include in this category designing charts, graphs, and other data visualizations to help present and communicate results.
Technical skills - learning one or more statistical software systems such as R/S-PLUS, SAS, SPSS, Stata, Matlab, etc. will give you the tools to apply quantitative skills in practice. Beyond that, the more self-reliant you are at extracting and manipulating your data directly, the more quickly you can explore your data and test ideas. So being adept with the technology you’re likely to encounter will help tremendously. Most of the information you’d be dealing with in sports statistics would be in a database, so learning SQL or another query language is important. In addition, mastering advanced spreadsheet skills such as pivot tables, macros, scripting, and chart customization would be useful.
Domain knowledge - truly understanding the sport you want to analyze professionally is critical to being successful. Knowing the rules of the game; studying how front offices operate; finding out how players are recruited, developed, and evaluated; and even just learning the jargon used within the industry will help you integrate into the organization. You’ll come to understand what problems are important to the GM and other decisionmakers, as well as what information is available, how it’s collected, what it means, and what its limitations are. Also, I recommend keeping up with the discussions in your sport’s analytic community so you know about the latest developments and what’s considered the state of the art in the public sphere. One of the great things about being a sports statistician is getting to follow your favorite websites and blogs as a legitimate part of your job!
In this Part II research paper I add a section which filters out only the English soccer leagues, with the revenue and profit & loss all computed per session rather than per annum, to make it applicable to my future staking in the real world, together with proportional staking and money management of the staking pools. Feel free to browse the content page Betting Strategy and Model Validation.
Journey to the West
Statistical analysis of sportsbooks and soccer economics is popular in the US, Europe, and the Pacific, but not yet in Asia. Here I am learning from the Western professional sportsbook consultancy firms and sharing with those who like scientific analysis of soccer.
If you are interested in becoming a punter, feel free to read the presentation paper below from a British consultancy firm to learn the requirements of being a professional gambler.
Mark Dixon
graph 6.1.1 : The pioneer of sportsbook statistical analysis, ATASS — founder : Mark Dixon
Niko Marttinen (2001) devised a very detailed, useful, and practically applicable betting system. His ordered probit model shows higher predictive accuracy than his Poisson (Escore) model. Meanwhile, ®γσ, Lian Hu ENG (2016)52 (the research, modelling and testing the efficiency of odds prices, was completed in 2010; kindly refer to the 3rd paper in the ‘Reference for industry knowledge and academic research’ portion under 7.4 References) built a weight-inflated diagonal Poisson model which is more complicated and sophisticated, and later ®γσ, Lian Hu ENG (2014)53 (kindly refer to the 4th paper in the same portion under 7.4 References). He also describes an automatic, systematic trading system written in VBA + S-Plus + Excel + SQL54 (the betting system is described in his paper), which is very useful as a reference. The author used VBA to automate the algorithmic trading, though he had no Asian Handicap and Goal Line odds price data to simulate, compared to mine. Nowadays, shinyapps with RStudioConnect can also power an algorithmic trading system; however, the session timeout issue55 (the connection timeout might be a big issue for real-time algorithmic trading) needs to be considered. The shinydashboard example from RStudio might cope with the issue.
John Fingleton & Patrick Waldron (1999) applied the Shin model to test the proportions of hedge funds and smart punters. As I stated in 4.2 Linear Ⓜodel, SparkR, RHadoop, and NoSQL are required to analyse the high-volume betslip dataset. It would be interesting, and I will conduct the research, if all betslips of the bookmaker(s) become available in the future.
In 4.3 Kelly Ⓜodel we tested the staking model; in table 4.2.1 we applied the linear models and chose the best-fit model based on the edge of the odds price. In 4.4 Poisson Ⓜodel we tried to reverse the odds prices placed to get the probabilities of different scores. Now we test the return of staking on different handicaps (e.g. 0, 0.25, 0.5, 0.75, 1, etc.) to find which handicap earns the most. Nowadays the hottest matches of the four major leagues offer several handicap markets, so there will be another case study and research on increasing profit from the same probabilities and edge but staking on different handicaps. The dataset will be collected for future research. The effects of Win-Half and Loss-Half might be captured more effectively by Poisson models, since they are discrete outcomes, whereas I treat them as known linear effects this time because the handicap odds prices we placed always lie within the range 0.7 to 1.25.
I will apply Shiny to write a dynamic website exposing the function as a web-based app. I am currently conducting another research project, Analyse the Finance and Stocks Price of Bookmakers, which analyses the revenue and profit & loss of public listed companies and also private companies. You are welcome to refer to SHOW ME SHINY and build your own shinyapps.
I will also write it as a package for easier loading and logging.
Reviewed the previous version; DT::datatable was updated to a new version which replaced the Button extension from TableTools; removed sparkline and htmlwidget
Applied linear regression to test the efficiency of the staking model by consultancy firm A
File pre-release version 0.9.4 - 2016-09-28 00:15:24 JST
Added linear regression and shinyApp to test the effects on staking
File pre-release version 0.9.5 - 2016-12-20 02:26:19 JST
Added Kelly criterion for 110 main-funds with each has 19 sub-funds.
7.3 Speech and Blooper
Firstly, I appreciate those who shed light on my research. Meanwhile, I am happy to learn from the research.
Since the rmarkdown file has quite a few sections and titles, you might expand or collapse the code by referring to Code Folding and Sections for easier reading.
There were quite a few errors when I knitted the HTML:
Let’s say it always got stuck (not responding, yet considered complete) at 29%. I tried a couple of times, and sometimes it prompted different errors (upgrading the Droplet to a larger RAM size didn’t help); eventually I applied rm() and gc() to remove objects after use and also clear the memory space.
I need to reload the package via suppressAll(library('networkD3')) in chunk decission-tree-A prior to applying the function simpleNetwork, even though I load it in chunk libs at the beginning of section 1; otherwise that particular function cannot be found.
xtable always shows LaTeX output rather than a table. I raised a question in COS : 求助!knitr Rmd pdf 中文编译 (“Help! Compiling Chinese in a knitr Rmd PDF”), 2016-08-19 9:56 pm, post #7. Here I try other packages like textreg and stargazer. You can refer to the Test version to view the output of the stargazer function; I kept the source code but added eval = FALSE in the chunks named lm-summary and lm-anova so that the code is not executed.
Remark : When I rewrote Report with ShinyApps : Linear Regression Analysis on Odds Price of Stakes and wanted to post it to RStudioConnect, the wizard only allowed me to post to rPubs.com (but everyone knows rPubs only allows static documents, which cannot support a Shinyapp). Therefore kindly refer to https://beta.rstudioconnect.com/content/1766/. You might download and run it locally, since the web-based version is always affected by the wizards: sometimes only the datatable is viewable, sometimes only googleVis, and sometimes it cannot be accessed at all.
The analysis in Part I might differ slightly from Part II due to the timezone issue.
The daily financial settlement time is EST 0000 or HKT 1200.
The data in Part II is filtered for observation and statistical analysis purposes, while Part I just summarises and breaks down the bets, including all bets.
I currently work as a customer service operator and do my own research as a smart punter. I hope my sportsbook hedge fund company website, Scibrokes®, will be running as a business soon…
Job Task : A customer service executive at Mindpearl, handling Malaysia Airlines customers.
Remarks : Mindpearl is an outsourcing company for a few airline companies, handling Malaysia Airlines customers. It converted from a manpower recruitment agency to a proper outsourcing company based in Fiji (an island nation in the Asia Pacific).
Reason for Leaving : HR Manager Muthu Kuna said I didn’t know how to play black magic and naively followed Malaysian law by taking public holidays, and fired me. She is a wizard controlled by HwaCai from No32 Tmn RHU, and would be the scapegoat if I didn’t know black magic. In fact, that is the wizards’ affair, not a human’s responsibility, to be a wizard’s scapegoat.
Job Task : Handling inbound and outbound calls, check and approve/reject customers’ ordered transactions.
Remarks : An outsourcing HR company which insources staff into Xerox (M) Bhd to operate Apple Inc.’s project.
Reason for Leaving : Manager Robin Wang Yan Ping said I didn’t know how to play black magic and fired me. He is a wizard controlled by HwaCai from No32 Tmn RHU, and would be the scapegoat if I didn’t know black magic. In fact, that is the wizards’ affair, not a human’s responsibility, to be a wizard’s scapegoat.
Job Task : Handling inbound calls from Taiwanese customers’ orders and maintenance.
Remarks : A printer company which provides an IT service to customers and a few retail outlet clients (FamilyMart, 7Eleven, HiLife and OKMart).
Reason for Leaving : Manager Chai Yun Cheah said I didn’t know how to play black magic and fired me. He is a wizard controlled by JiGong from WanFosi, and would be the scapegoat if I didn’t know black magic. In fact, that is the wizards’ affair, not a human’s responsibility, to be a wizard’s scapegoat.
3.5 Sportsbook Trading Team Leader - SBSolutionCorp Ltd
Job Task : Leading a few team members to update scores and settle bets.
Remarks : Operations conducted by SB SolutionCorp, based in Makati City (Manila, Philippines), with the software company GBLinks based in Kowloon (Hong Kong, China)
Reason for Leaving : Manager Mike Chong Yu Meng requested that I hire more team members to ease the job, or that I learn to play black magic; but I just want to enjoy a simple office-worker life and don’t know black magic, since I am not a wizard, so I chose to resign.
3.6 Live Compliance Editor - U-Drive Media Sdn Bhd
Job Task : Watching and filtering video streams from abroad prior to playing them in our country.
Remarks : An outsourcing company which insource staffs into TM Bhd (Telekom Malaysia ).
Reason for Leaving : The director requested a change to the accounting or marketing department, but marketing is not my strength and I just want to enjoy a simple office-worker life, so I chose to resign. Director Kent said I didn’t know how to play black magic, that Chinese need to follow Malays’ instructions since Malaysia is neither China nor Taiwan, and fired me. He is a wizard controlled by JiGong from WanFosi, and would be the scapegoat if I didn’t know black magic. In fact, that is the wizards’ affair, not a human’s responsibility, to be a wizard’s scapegoat.
3.7 Sportsbook Customer Service - Scicom (MSC) Bhd
Job Task : Handling inbound and outbound calls, live chat, feedback, and all deposit, withdrawal, and phone-bet transactions. Sometimes needed to do translation and analysis job tasks as well.
Remarks : An outsourcing public listed company which handle Fortune 500 Ladbrokes PLC’s business in Far East Asia.
Reason for Leaving : HR requested a project change since the project closed; as there was no other sportsbook-related project besides Ladbrokes at that time, I chose to resign.
Job Task : Providing training to newbies on sportsbook trading and live scouting. Needed to back up trading as well. Sometimes needed to do translation and analysis job tasks.
Remarks : A cross-global business group which cooperates with a couple of companies based in Hong Kong, Taiwan, IOM (United Kingdom), Cambodia, etc. A few brands like 188Bet, Crown, etc.
Reason for Leaving : Could not stand the daily 12 to 16 working hours at that time, so I chose to resign.
Job Task : Providing training to newbies on sportsbook trading and stock purchasing; needed to update scores for bets’ financial settlement; handled live matches on corners. Sometimes needed to do translation and analysis job tasks.
Remarks : An outsourcing company handling sportsbook trading, based in Malaysia, which cooperates with quite a few companies: Hong Kong’s Citics PLC, Australia’s Betworks Pty, the UK’s Starlizard Ltd, etc.
Reason for Leaving : Found a better offer, introduced by friends.
Learn PhantomJS for background webdriver, markdown, and setting up RStudio Server. However, RSelenium and PhantomJS ran into some errors, so I simply applied rvest to harvest the data from spbo without odds prices.
Gather and filter the staking dataset from sportsbook consultancy firm A and validate their staking models.
Gather livescore data from the spbo livescore website and filter the data.
Run natural language analysis on the scraped livescore data to match the team names from firm A.
Testing the efficiency of some coding.
Learn modified Kelly staking model.
Learn investment fund management.
graph 4.2.1 : Sample data of candlestick chart for fund growth of firm A via agent A. ($1 = $10,000)
Apply the Poisson model to the data scraped in WebDriver Dynamic Webpage Scrapping to build a prediction model.
Rewrite the Kelly model from Odds Modelling and Testing Inefficiency of Sports-Bookmakers in R, with the optimal value \(r\) stated in Dixon & Coles (1996), to test the efficiency and the returns of investment based on the Poisson model.
Improve the efficiency of data management and also timing of calculation.
Return of Investment from Kelly Model
Breakdown of Operators - Profit & Loss on the Odds Price with/without Overrounds.
table 4.6.1 : Breakdown of Operators - Profit & Loss on the Odds Price with/without Overrounds.
Learning Shiny and knitr in Coursera JHU - 09 Developing Data Products; occasionally noticed the fbRanks package, and simply scraped the livescores of the English Premier League to test the Dixon-Coles model.
Similar to my previous research, the author wrote a scraper for the statistical webpage but saved the data as csv; he added a surface/venue parameter and also improved the calculation efficiency for a bigger dataset. You can go to http://www.lastplanetranking.blogspot.my for more details.
I’ve modified my model a bit by referring to the author’s idea on the efficiency of glm packages, and will review my model again and also write my company website, Scibrokes® Test Site, in Shiny when free.
Additional information about the website servers: SoccerMetrics, sotdoc and matchOdds.org might be good references for my future website writing. SotDoc roughly states that the annual returns of SmartOdds are around 2.5~3%, while their investment return rate is around 7% per annum, compared to most sportsbookmakers’ 0.5% or Crown’s 1.8%. SoccerMetrics uses Python to set up the Soccermetrics API Python Client, while MatchOdds applies Tomcat to manage the data server.
Relenium and RSelenium published; scrape the odds prices of sportsbookmakers and also livescore data from the 7M and NowGoal websites, as well as filter the odds price data.
Learning programming to gather livescore and odds price data from the Gooooal livescore website.
From the scraped result, there were a lot of errors and wrong results, since a static website’s result cannot change once the score is updated. There might be some postponed matches and also suspended matches. Moreover, dynamic webpages require a Selenium app; since Rpy lost maintenance and rPython and RSelenium were not yet published, I gave up, learned Python, and started WebDriver Dynamic Webpage Scrapping.
4.10 Odds Modelling and Testing Inefficiency of Sports-Bookmakers
Learn RExcel, CrystalBall, ModelRisk, etc., and choose the R open-source software to start my research.
Collect the livescores and also the 1x2, Asian Handicap, and Over/Under odds price data of 29 sportsbookmakers manually from the 500WAN, BET007 and NowGoal websites, and filter the odds price data from 2006 to 2011.
Apply the Poisson model in R to test the return on the investment. This research job is the most complete and successful, and the first research to write the whole odds-compilation EM model and data management, referring to thousands of research papers on sportsbook odds modelling, after resigning from Caspo Inc.
paper 4.10.1 : ®γσ, Eng Lian Hu (2016)
4.11 Apply Poisson regression on sports odds modelling
Write my own Poisson model based on what I learnt from Stuart Doyle while working at Telebiz Sdn Bhd.
Not being a master of Macro VBA, I simply recall the concept from a spreadsheet VBA Excel file from Paul Judge for data management, which I accidentally deleted a few years ago.
Remarks: Stuart and Paul used to be sportsbook consultants to Telebiz and set up their own advisory company, SportsTrust, during 2009/2010.
5. Time Line
6. Skill
Skill and Expertise
Skill and Expertise Rating Level (from newbie 1 to expert 10)
I will write a shinyapp CV some other day for easier categorisation and viewing.
You are welcome to create your own by referring to my Personal WP Blog (which will be migrated to Jekyll/Hugo some other day). You are welcome to contact me at englianhu@gmail.com.
8. Social Network
Appendices
9. Documenting File Creation
It’s useful to record some information about how your file was created.